
Datachemical LAB Case Study Interview
From the research and education fields


Professor Tatsuya Oshima
Professor, Chemistry and Life Science Program, Faculty of Engineering, University of Miyazaki
×
Shogo Yoshimaru
Data Chemical Co., Ltd. Representative Director and CEO

Background
"No-code x Materials Informatics" spreading to research and education
Using AI in educational programs: the University of Miyazaki's effort to develop students' ability to think with data
Materials informatics (MI), which is essential for accelerating materials development, is now spreading not only in industry but also in education.
However, MI education faces high hurdles, such as the need for specialized knowledge and programming (for example, Python), and introducing it into the curriculum is not easy.
In this interview, Professor Tatsuya Oshima of the Faculty of Engineering at the University of Miyazaki discusses how he has used the no-code MI tool "Datachemical LAB" in his own laboratory and in university lectures, allowing students to "set their own problems and verify hypotheses using AI."
Practical education that develops the ability to "think with data": what are its background, its actual results, and its outlook for the future? Take a look at the cutting edge of digital transformation in education.


Speaker introduction

Guest
Professor Tatsuya Oshima
Professor, Chemistry and Life Science Program, Faculty of Engineering, University of Miyazaki

Interviewer
Shogo Yoshimaru
Data Chemical Co., Ltd. Representative Director and CEO
At the intersection of experimental science and data science: Discussing the current state of research and education

Table of contents
What prompted you to start using machine learning in your research?
Student-led machine learning expands through the use of Datachemical LAB
Results seen in the educational field and the reality of data utilization
Experimental Science and Machine Learning: How to Deal with "Data Shortage"
How to deal with AI and machine learning: The attitude students need
1. What prompted you to start using machine learning in your research?
Yoshimaru (host)
Thank you for taking the time to speak with us today. It's been about three years since your university introduced Datachemical LAB. First, I'd like to ask you about how you've used it in your research.
Professor Oshima
My research concerns the solvent extraction of metal chloride complexes. Traditional research has tended to focus on the design of the metal complexes, but I realized that in my system the physical properties of the solvent itself are important parameters in metal extraction. From there I began to focus on collecting and organizing data on solvents.
At first, we used solubility parameters to assess extraction ability qualitatively, but we wondered whether we could go a step further and predict extraction rates quantitatively.
Yoshimaru
That's when you started working on machine learning.
Professor Oshima
Yes. However, at the time the premise was that code had to be written in Python, and since no one in the lab, including myself, had a background in information science, it was difficult to keep this up. Then Datachemical LAB was released, and we thought, "This will work." From there, the use of machine learning in the lab spread rapidly.

2. Student-led machine learning expands through the use of Datachemical LAB
Yoshimaru
What changes have you seen since Datachemical LAB was introduced?
Professor Oshima
The biggest advantage is that no programming is required. I think this removes a huge barrier for students. For chemistry students, learning Python from scratch and building a model is a big hurdle, but with Datachemical LAB even those without prior knowledge can get started.
As a result, going back to Python is no longer an option for us. Our policy is that if we want to do machine learning, we do it with Datachemical LAB.
Yoshimaru
Has it also led to results?
Professor Oshima
The first model we worked on, a prediction model for extraction rates, achieved a coefficient of determination (R²) of around 0.94. Since then, the amount of data handled by students has increased and the verification methods have changed, and we are now able to build highly accurate models with coefficients of determination of 0.98 to 0.99.
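
Note: for readers unfamiliar with the metric, the coefficient of determination quoted here is conventionally defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. The following is a minimal sketch of that standard definition in Python. The function name and the sample numbers are illustrative assumptions, not data from Professor Oshima's lab, and the interview does not state whether the quoted values were measured on training or validation data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: how far predictions are from observations.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical extraction rates (%), made up purely for illustration.
observed = [12.0, 35.0, 58.0, 74.0, 91.0]
predicted = [15.0, 33.0, 55.0, 78.0, 89.0]
print(round(r_squared(observed, predicted), 3))
```

A value of 1.0 means the predictions reproduce the observations exactly. Values close to 1, such as those mentioned above, mean that only a small fraction of the variation in the data is left unexplained.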

3. Results seen in the educational field and the reality of data utilization
Yoshimaru
We have concluded a partnership agreement with your university for the purpose of collaborating in the field of education. As part of that, Datachemical LAB was actually used in education this year.
Professor Oshima
Yes. The University of Miyazaki's Faculty of Engineering is currently working on digital transformation education, and this time we used Datachemical LAB in the experimental exercises. Previously, we had considered having students write code in Python, but given the significant time and technical constraints, this was not realistic.
Yoshimaru
So Datachemical LAB can cover that.
Professor Oshima
Yes. This time the theme was "predicting the aqueous solubility of organic compounds," and the students experienced everything from model building to verification. For many of them, it seemed to be a completely different experience from regular student experiments.
In particular, being able to obtain data instantly in CSV format, and to extract features from structural formulas and make predictions from them, was new to the students, who seemed surprised that such things were possible.
[Exercise content]
Prediction of logS (water solubility) of organic compounds
Using the structures (SMILES notation) and water solubility data of 800 existing organic compounds as training data, we performed machine learning predictions using the following steps.
① Descriptor calculation : 208 molecular descriptors are generated using RDKit from the structures of the organic compounds in the training data.
② Preprocessing : Select features, remove unnecessary descriptors, and organize the data into a form suitable for analysis.
③ Data visualization : Understand the features of the data using heat maps of correlation coefficients and scatter plots.
④ Model building and prediction : Build a predictive model using the training data and predict the logS of the compound each participant selected as their research subject.
⑤ Verification : Compare the predicted results with literature values and consider the relationship with the structure.

Experimental exercise using Datachemical LAB
4. Experimental Science and Machine Learning: How to Deal with "Data Shortage"
Yoshimaru
What do you think are the challenges in using machine learning in research and education?
Professor Oshima
The amount of data is an issue. Experimental research is generally limited in the amount of data it can produce. Compared with high-throughput research that collects tens of thousands of data points, the conditions are inevitably stricter.
Our search for solvents for metal extraction is a topic where we can collect a relatively large amount of data through solvent screening, so it is well suited to machine learning. So now we are constantly thinking about which topics allow us to realistically collect data and are also suitable for modeling.
Yoshimaru
It seems that determining how much data can be collected based on the characteristics of the topic and the reality of the research is important. I believe that Datachemical LAB can make a contribution precisely in such "fields where data is limited." We have made repeated improvements to various functions to enable highly accurate predictions even with small amounts of data. We hope to continue collaborating with research and education sites to expand the possibilities in this field.
5. How to deal with AI and machine learning: The attitude students need
Yoshimaru
Finally, do you have any words of advice for students who are planning to study data science?
Professor Oshima
First of all, don't rely too much on AI or machine learning. In this student experiment, the prediction accuracy was good, but the predictions were not perfectly consistent with the literature values. But that's okay. Rather, it's important to experience that "sometimes things don't work out."
Yoshimaru
I see, so that's what it means to master it.
Professor Oshima
I think so. Machine learning is an extension of statistics, so results aren't everything; how you think about them is important. As AI develops and data science becomes more important, it will be essential to proactively incorporate statistical methods rather than relying solely on traditional theoretical scientific approaches. That's why I want students to have the mindset of "mastering AI." It is certainly a powerful weapon if used well, but you shouldn't let it overwhelm you. I want them to keep that "sense of distance" and take an active role in their work.
In the future, we also hope to work on developing more practical packaged teaching materials with a view to using them in educational settings.
Yoshimaru
I think this is a wonderful initiative. We would like to create more opportunities for Datachemical LAB to be used widely in university education, and we would love to cooperate with your efforts to develop teaching materials.
Professor Oshima
Please do.
Yoshimaru
Thank you very much. We were greatly inspired by your sincere approach to research and education.
Professor Oshima
Thank you very much.
