Over the years, working as a data modeler and database architect, I have noticed that there are a few rules that should be followed during data modeling and development. Here I describe some tips in the hope that they might help you. I have listed the tips in the order in which they arise during the project lifecycle rather than by importance or by how common they are.
The Protocols and Structures for Inference (PSI) project has developed an architecture for presenting machine learning algorithms, their inputs (data) and outputs (predictors) as resource-oriented RESTful web services in order to make machine learning technology accessible to a broader range of people than just machine learning researchers.
Data Science Studio (DSS) from Dataiku is a complete data science software tool for developers and analysts, which significantly shortens the time-consuming load-clean-train-test-deploy cycles of building predictive applications. A community edition and a free trial are available.
Feature selection techniques have become an apparent need in many bioinformatics applications. In addition to the large pool of techniques that have already been developed in the machine learning and data mining fields, specific applications in bioinformatics have led to a wealth of newly proposed techniques.