Thesis Opportunities

SoftRobot offers exciting opportunities for thesis students. During the internship, you will be supervised by one of our machine learning engineers and work with state-of-the-art tools in NLP and CV. This is a great way to see how AI is used in practice to provide insights to end users. With us, you will work as part of a team, contribute to development discussions, and get a look at our DevOps infrastructure.

Who You Are

We are looking for passionate and enthusiastic students who are constantly learning in the fields of computer science and AI. We encourage curiosity and flexibility in where you take your ideas and how you transform them into something valuable. We require that our interns have an understanding of Python. Some knowledge of SQL and of Python frameworks and libraries used in AI (e.g. scikit-learn, PyTorch, Keras, TensorFlow, Hugging Face) is a plus.

Thesis Suggestions

1. Decomposition, Refinement, and Alignment in RAG-based Prompt Engineering

This research explores prompt engineering for Large Language Models (LLMs) within a Retrieval-Augmented Generation (RAG) architecture. You will investigate how prompts can be decomposed to target specific LLM responses, how they can be refined for better accuracy and relevance, and how far LLM output diverges from human intent. The overarching objective is to optimize prompt engineering so that output quality improves in a RAG context. The results will feed into a product intended for production.

2. Visually-Rich Document Understanding with Multimodal Transformers

We would like to investigate and fine-tune a pre-trained model for one of two use cases:

  • Document Classification. An investigation of ways to fine-tune a pre-trained multimodal transformer to classify visually rich documents. A challenging aspect is that documents from different classes are sometimes clearly distinguishable both visually and semantically, and sometimes differ very little. A potential part of this project could be to find a method for identifying the cases where the model is likely to be correct and the cases where it is likely to be wrong.
  • Domain (Economic) Specific Predictions. We are interested in seeing to what degree a pre-trained multimodal transformer can be fine-tuned to link visually rich documents to a domain-specific classification or clustering, possibly at the row or sentence level. A possible extension of this project is to link the predicted results to monetary data found in the document.
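
To give a feel for the last point under Document Classification, here is a minimal sketch of one common baseline: accept a prediction only when the model's top softmax probability clears a threshold, and abstain otherwise. This is an illustration under our own assumptions, not a description of our production method. The class labels, the logits, and the 0.8 threshold are all hypothetical, and a thesis would likely compare this baseline against better-calibrated alternatives.

```python
import math


def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, labels, threshold=0.8):
    """Return (label, confidence), or (None, confidence) when unsure.

    The threshold is a hypothetical value; in practice it would be
    tuned on held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence >= threshold:
        return labels[best], confidence
    return None, confidence


# Hypothetical document classes and classifier outputs.
labels = ["invoice", "receipt", "contract"]

# One class clearly dominates, so the prediction is accepted.
print(predict_or_abstain([4.0, 0.5, 0.2], labels))

# The classes are nearly tied, so the function abstains (label is None).
print(predict_or_abstain([1.1, 1.0, 0.9], labels))
```

Documents on which the function abstains could be routed to manual review, which is one way to trade coverage for accuracy.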

3. Data Augmentation for Visually-Rich Documents

We would like to investigate the feasibility of constructing synthetic financial documents with generative and language models (e.g. BERT, GANs, GPT) as input for other services where data is scarce. This research could be applied to several of our products.

Apply

To apply, send your resume and cover letter to contact@softrobot.io. In the cover letter, please state which thesis suggestion you have chosen and why it interests you. Applications will be reviewed in late fall.