Welcome to the Optibrium Community






FAQs



What modelling techniques can I use?

Tuesday, 29 September 2009 21:14
Nick Foster
The following techniques can be used to build models in StarDrop:
  • Partial Least Squares - This is a robust technique for generating linear models based on multiple descriptors.
  • Radial Basis Functions - This unique approach is an efficient numerical technique to generate a non-linear function that ‘maps’ the property variations in a descriptor space. A more computationally intensive version of this approach can also be used where a genetic algorithm initially searches for the optimal descriptor space in which to describe those variations.
  • Gaussian Processes - This powerful ‘machine learning’ technique generates robust non-linear models using a Bayesian statistical approach.
  • Decision Trees - This is a recursive partitioning approach to building classification models in cases where good ‘continuous’ data are not available.
  • Random Forests - This is an ensemble method that makes predictions based on the output of a collection of random trees.

What modelling technique(s) will the Auto-Modeller use?

Tuesday, 29 September 2009 21:14
Nick Foster

All the appropriate methods (listed above) are automatically applied to each modelling problem, and the resulting models are rigorously compared to identify the best model of your data.
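The comparison step can be illustrated with a minimal sketch. It does not show how the Auto-Modeller works internally: the function names, the toy candidate models and the use of root-mean-square error (RMSE) as the comparison criterion are all assumptions made for illustration.

```python
import math

def rmse(model, data):
    """Root-mean-square error of a model over (descriptor, property) pairs."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))

def select_best_model(candidates, validation_set):
    """Score every candidate on the same held-out data and pick the lowest RMSE.

    `candidates` maps a name to a callable; both are hypothetical stand-ins
    for models built with different techniques.
    """
    scores = {name: rmse(model, validation_set) for name, model in candidates.items()}
    best_name = min(scores, key=scores.get)
    return best_name, scores

# Toy data and toy models, invented purely for this example.
validation_set = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
candidates = {
    "constant": lambda x: 4.0,
    "linear": lambda x: 2.0 * x,
}
best_name, scores = select_best_model(candidates, validation_set)
```

Scoring every candidate on the same data, none of which was used for training, is what makes the comparison fair.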


Can my colleagues use the model I generate?

Tuesday, 29 September 2009 21:13
Ed Champness

Yes, once a model has been built, you can easily share it or ‘publish’ it on a model server for all StarDrop users to access. This enables new models to be rapidly applied to the design of new chemistry.


Why do I need to split my data set?

Tuesday, 29 September 2009 21:13
Ed Champness

The data set of molecules and their experimentally determined property values is split into three subsets: a training set, used to train individual models with each modelling technique; a validation set, used to compare the performance of those models and select the best; and an independent test set, used to confirm the predictive power of the final model. Because the test set plays no part in training or model selection, it gives an independent check that the final model is robust.
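A three-way split of this kind might look like the following sketch. The function name, the random shuffling and the 70/15/15 proportions are assumptions for illustration; the answer above does not state how StarDrop chooses or sizes the subsets.

```python
import random

def split_data_set(records, train_frac=0.7, validation_frac=0.15, seed=0):
    """Shuffle records and split them into training, validation and test sets.

    The fractions are illustrative defaults, not StarDrop's settings.
    Whatever is left after the first two subsets becomes the test set.
    """
    if train_frac <= 0 or validation_frac <= 0 or train_frac + validation_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_validation = int(len(shuffled) * validation_frac)
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return training, validation, test

# Integers stand in for molecules with measured property values.
records = list(range(100))
training, validation, test = split_data_set(records)
```

Each record lands in exactly one subset, so the test set stays untouched until the final model has been chosen.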





