Like many investors, the Catalyst Fund receives a lot of pitch decks from startups looking to raise funding. Since January 2023, the Catalyst Fund has received and reviewed over 3,000 pitch decks — more than 50,000 pages of information — from climate tech innovators seeking investment for their ventures. Wading through those decks can take up the better part of an investment associate's time.

Generative AI (GenAI), with its analytical and reviewing capabilities, holds promise to help investment professionals in the first steps of their analysis by screening prospective pitch decks. Using GenAI for deck review offers advantages such as consistency and automation, but to date, investors have limited experience using such tools.

Catalyst Fund took GenAI for a spin to screen decks as part of an effort to landscape climate tech innovations in Africa (published here). We found that, although GenAI promises to save time and effort in the long run, using it to screen decks is not yet straightforward or autonomous. In this piece, we share some thoughts on the potential of GenAI for investors, and the prompts and tools we used in case others want to replicate the approach.

GenAI may one day make deck reviews faster and more objective 

In theory, GenAI tools for screening pitch decks offer many advantages: they can address human bias, error, and even time limitations.

Relative to humans, who have moods and tempers, GenAI is uniform and consistent. This means that decks are reviewed in exactly the same way regardless of the time of day, the length of the deck, its color scheme, or an associate's expertise. Whereas human investment associates have variable personal interests and academic backgrounds, GenAI will invariably follow the instructions provided in a prompt, so decks are screened in a uniform fashion.

AI tools are also faster than human associates. Once the tools are trained and the decks converted to text, AI analysis is swift, taking less than five minutes to analyze hundreds of decks. When working well, this approach could remove the burden of repetitive work for associates and reduce the chances of missing good decks due to analyst fatigue. However, the opposite risk also exists: that GenAI discards decks an associate might have found reason to take a chance on. 

With almost zero marginal cost of running GenAI, investment funds using such tools would be able to assess many more decks with the same human resources. This should expand the pool of startups under consideration and improve the chances of identifying winners.

Furthermore, revisiting decks with new prompts is also nearly costless. As investors edit and adjust their theses, they can easily rerun analyses on their full databases with minimal effort. Similarly, founders can continuously update their pitch decks without worrying about annoying or creating additional work for investment teams. 

Given this potential, GenAI may soon be adopted more broadly by investment teams; solutions such as Alteryx and Knime are already being used to summarize documents automatically. In fact, we found a few solutions already on the market for deck review, DeckMatch and Nemo, and a few others incorporating other GenAI use cases, such as Cap and Decile Base. VC Lab has also published a list of the best artificial intelligence (AI) tools for venture capital professionals, from general tools to pipeline and LP management.

Still, most investment teams currently rely on cloud-based suites such as Google Workspace and Office 365 for document management. Although these suites provide integrated GenAI assistants such as Duet AI and Copilot, they focus on tasks such as content creation and meeting summaries; they are yet to offer screening or assessment capabilities.

Challenges to using AI, for now

We set out to screen numerous pitch decks as part of a landscaping exercise. We hypothesized that the screening exercise might lead to additional use cases for GenAI in our work. We found that the GenAI tools we trained were able to tag and summarize well, but that the information provided wasn't sophisticated enough to suggest other uses for the time being. With further training and development, that may change.

GenAI allowed us to systematically collate information about startups from their decks. That gave us a good overview of the pipeline, but not much to distinguish startups from one another: the summaries it created were insufficient to assess whether solutions adequately address problems or whether traction is real. Still, even at this level of analysis, GenAI can help outline characteristics of the pipeline and identify gaps in reach and coverage. For example, we were largely able to tell what percentage of startups in our pipeline included female founders.

Another barrier is that GenAI can only read text, so PDFs are a challenge. The files must first be converted to text via OCR libraries, and in that process, some content is lost. In this exercise, we relied on open-source libraries and found that 10% of the documents could not be converted to text, while 8% of the converted documents yielded minimal information. Fortunately, it is easy to identify the files that were not converted to text.

Furthermore, as part of that conversion, important visual material is excluded. Many startups depend on illustrations and infographics to portray their models, and pitch decks are usually PDFs full of charts and graphical elements; such visualizations are not yet easily "readable" by text-based models.

Another challenge is guarding startup confidentiality. Some GenAI services use material submitted for analysis to train future models, or even disclose it more publicly. All of the tools we used for this analysis relied on OpenAI API calls, and OpenAI's terms clearly state that data submitted via the API is not used to train future models.

Screening pitch decks at scale currently requires multiple tools and numerous steps, including some programming to link the different parts

Given today's assortment of available tools, investors looking to screen or assess pitch decks (or any content) using GenAI must link several tools together. We used OCR (Optical Character Recognition) tools, the OpenAI API, Google Workspace, GPT for Sheets, Python, and the Bing API. Although this approach is simple in code, implementing it in Sheets was challenging due to limitations of the GPT for Sheets extension, which re-executed its API queries each time we modified the cell contents referenced in the formulae.

  1. First, we converted pitch decks to text so the LLM could "read" the written content. Although this feature is available in the user interface of ChatGPT Plus, availability via the API is limited to files smaller than 512 MB, which was not enough for our purpose. We relied on open-source solutions for this, such as Tesseract.
  2. Next, we needed a GenAI LLM (Large Language Model) to query the content of the decks. We used the OpenAI API as the LLM provider: its performance is considered state of the art, and it has broad adoption and numerous online resources to aid usage (such as developer communities, libraries, and sample code on GitHub). The price to assess 300 decks was reasonable (less than US$40). The OpenAI API nevertheless has limits on requests per minute and tokens per minute; we had to space our requests over time to sidestep this limitation (see the sketch after this list), although more sophisticated strategies are available.
  3. We used the Google Workspace stack: Google Colab to run the code, and Google Drive to store the original decks, the text versions post-conversion, and a further set of files holding the relevant parts of the decks selected using GenAI, stored as JSON (JavaScript Object Notation), an efficient format for data exchange and processing. We used GPT for Sheets to create the categories for the variables; this extension provides a connector to the OpenAI API without requiring coding skills. Some training was needed to familiarize ourselves with the extension and its limitations, with basic prompt engineering techniques, and with the need to split the training results from the evaluation results.
  4. An analytical solution was then needed to orchestrate the different steps and automatically run the process for the 300 decks we considered. We selected Python as the scripting language to "glue" the various components of the solution together.
  5. Given the data we pulled from the decks, we wanted to supplement it with information from the web. We used the Bing API to retrieve results from the internet containing information not included in the decks, such as the founder's university and whether the pitch was the founder's first venture.
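As an illustration of the request spacing mentioned in point 2, here is a minimal sketch in Python (the helper name and delay value are ours to tune, not prescribed by OpenAI):

    import time

    # Illustrative pacing helper: call the LLM once per deck, pausing between
    # calls to stay under the OpenAI requests-per-minute limit. Exponential
    # backoff on rate-limit errors would be a more robust strategy.
    def query_with_spacing(deck_texts, query_fn, delay_seconds=3):
        results = []
        for text in deck_texts:
            results.append(query_fn(text))
            time.sleep(delay_seconds)
        return results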

Future work

One day, investors will likely be able to use GenAI to read and assess pitch decks, and some AI enthusiasts may even delegate the go/no-go decision to GPTs (we see some products specifically offering GenAI-enabled top-of-funnel (ToFu) analysis, and a simpler example in this GPT).

However, the best use case we found, for now, is for GenAI to continuously analyze newly received decks, selecting those likely to merit further consideration and flagging those that do not seem to fit our thesis. This initial flagging allows analysts to screen pitch decks more efficiently, preserving time for other tasks such as deeper analysis.

Finally, GenAI could help develop intelligence about which startups are succeeding and which are not by allowing retrospective analysis on pitch decks. Such analysis would allow funds to identify potential early indicators of company success to inform selection and future assessment.


ANNEX: A How-To Guide to screen decks with GenAI

If you want to follow our approach, here are a few practical steps, recommendations, and examples:

  1. Define the information you want to retrieve from the decks. In our case, we defined four key areas:
  • User types the companies are serving
  • Problem/solution the company is addressing
  • Venture stage of the company
  • Founder characteristics: gender, country of origin, and the university they attended
  2. Choose your tools: look at the tools you currently use in your company and identify ways to implement the solution with minimum investment. As mentioned, we chose the stack provided by Google Workspace (including Drive, Sheets, and Colab), the OpenAI API, and the Bing API. We used Python, a scripting language that is easy to understand and broadly available.
  3. Design a process: think about a workflow that makes sense with these tools so you can split the different phases of the analysis. In our case, we followed this process:
  • First, transform the pitch deck files into text using OCR. We stored the PDFs (also PPTs and some DocSend files) in one folder and the text files in another, so we did not have to re-run the conversion once done. See the details in point 4.
  • After this, we extracted relevant information from the decks using the OpenAI API. We stored this information in another folder as JSON files. See the detailed description in point 5 below.
  • Finally, we used complementary sources to retrieve required information that was not available in the decks. This is explained in points 6 and 7.

       For the Python implementation of this process, you can find a few tips and snippets below.

  4. Look for the right libraries. We used open-source libraries for the OCR conversion.
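Given the Tesseract reference in point 1 above, a typical combination (the exact wrapper libraries here are an assumption on our part) pairs the OCR engine with a PDF-to-image renderer:

    from pdf2image import convert_from_path  # renders each PDF page as an image (requires poppler)
    import pytesseract                       # Python wrapper around the Tesseract OCR engine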

These libraries enabled us to convert most of the PDFs to text. The process was slower than with alternative, faster libraries that do not support OCR (Optical Character Recognition) conversion.

  5. Structure your code. In our case, we created functions that can be called on each file, and then complementary functions to run them recursively over folders with similar content.

           5.1) Transform pitch decks to text. Below is an example of the kind of function needed to run the transformation from PDF to text.
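A minimal sketch of such a function, assuming the pdf2image and pytesseract libraries from point 4 (illustrative, not our exact code):

    from pdf2image import convert_from_path
    import pytesseract

    # OCR every page of a pitch deck PDF and return the concatenated text.
    def pdf_to_text(pdf_path: str) -> str:
        pages = convert_from_path(pdf_path)  # one image per page
        return "\n".join(pytesseract.image_to_string(page) for page in pages)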

          5.2) Ask the LLM to read the text version of the deck. 

We created a function that uses the OpenAI API to query a document with gpt-3.5-turbo-16k and retrieve the information we wanted to analyze.
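A sketch of what such a function can look like with the OpenAI Python SDK (v1-style client; the function name and message layout are illustrative):

    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    # Send the deck's text plus our extraction request to the LLM.
    def query_deck(deck_text: str, request: str) -> str:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo-16k",
            temperature=0,  # favor consistent, deterministic extraction
            messages=[{"role": "user",
                       "content": f"{request}\n\nDECK CONTENT:\n{deck_text}"}],
        )
        return response.choices[0].message.content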

         

          5.3) Craft a good prompt.

We extracted the relevant information by querying each of the decks with the same prompt. This prompt included all the text resulting from the PDF conversion, the request to the LLM to retrieve the information we were interested in (the users they serve, technology, solution), and the format in which we wanted the information back.

We combined the deck content, the query, and the required output format in a single prompt, and refined it until its performance met our purpose.
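An illustrative template along these lines captures the structure (the wording is an assumption; the field list mirrors the JSON example in 5.4):

    # Used as PROMPT_TEMPLATE.format(deck_text=...) before calling the LLM.
    PROMPT_TEMPLATE = """You will receive the full text of a startup pitch deck.
    Extract the following fields and answer ONLY with a JSON object with the keys:
    user_type, technology, problem, solution, stage, country, users, revenue,
    f1_name, f2_name, f3_name (founder names). Use "NA" for any field that is
    not present in the deck.

    DECK CONTENT:
    {deck_text}
    """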

          5.4) Get the answer in a consistent format. 

To easily manipulate the information in the following steps, we used JSON. We asked the LLM to provide the information in this format by including instructions in the prompt.

Below is an example of a JSON file extracted from one deck after querying the full-text document.

         {"user_type": "Medium and large private farms, governments, AgriFood companies, NGOs and Climate Organizations, Fertilizer and irrigation companies", 

        "technology": "VIRTUAL FIELD PROBING (VFP) Technology, MULTI-MODAL, MULTI-SPECTRAL SATELLITE DATA STREAMS, AI/DL algorithms", 

        "problem": "Huge challenges facing the agriculture sector including water and fertilizer waste, yield loss from diseases, climate change, and water scarcity", 

        "solution": "Innovative technology that intelligently computes Ag field water levels and crop dynamics using data from satellites and proprietary AI/DL algorithms", 

        "stage": "Seed", 

        "country": "NA", 

        "users": "NA", 

        "revenue": "NA", 

         "f1_name": "Karim Amer", 

         "f2_name": "Mohamed ElHelw", 

         "f3_name": "NA"}

Fields like "revenue" and "number of users" were only completed for a few companies, as this information was often absent from the decks. For instance, 213 decks lacked revenue information, and 231 omitted user numbers.

After some tests, we decided to use the gpt-3.5-turbo-16k model. It provided a token window large enough to process the full-text version of a deck, and delivered quality comparable to gpt-4 for this task at a fraction of the price. We also tried other approaches, such as summarization and vector embeddings using LangChain, but the extracted information was poorer and less consistent, depending on the deck size.

  6. Adapt the information to the analysts' tools.

From the JSON file above, it's evident that we extracted most of the relevant information. However, this information must still be fully structured before quantitative analysis. To accomplish this, we needed to categorize the text provided under 'user_type', 'technology', and 'solution' into predefined categories, or tags. Analysts can do this task better, and as they do not use Python, we moved this part of the analysis to Google Sheets and the GPT for Sheets extension.

          6.1) Move the information to Google Sheets

We loaded the information into a data frame and then moved it to a Google Sheet. We used a function like the one below to gather all the JSON files into the data frame.
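A sketch of that step, assuming pandas and one JSON file per deck in a single folder:

    import json
    from pathlib import Path
    import pandas as pd

    # Read every per-deck JSON file in a folder; each file becomes one row.
    def jsons_to_dataframe(folder: str) -> pd.DataFrame:
        records = [json.loads(p.read_text()) for p in Path(folder).glob("*.json")]
        return pd.DataFrame(records)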

           6.2) Load the data frame into Google Sheets.
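One way to do this from Colab is gspread together with the gspread-dataframe helper (our choice of connector here is illustrative):

    import gspread
    from gspread_dataframe import set_with_dataframe

    # Write the data frame into the first worksheet of a named Google Sheet.
    def dataframe_to_sheet(df, spreadsheet_name: str, credentials):
        gc = gspread.authorize(credentials)        # authenticated gspread client
        worksheet = gc.open(spreadsheet_name).sheet1
        set_with_dataframe(worksheet, df)          # writes headers + rows from A1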

  7. Tag the information based on your taxonomy.

We utilized "GPT for Sheets" to tag user type, technology, and solution into predefined categories.

Although we could have categorized the information using OpenAI API calls similar to those for extracting relevant data, we opted for the 'GPT for Sheets' extension, which facilitates these API calls directly from a Google Sheets file. This method enables analysts without coding skills to modify categories, 're-train' the system with a few samples, and independently evaluate the outcomes.
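For reference, the API-based alternative we decided against would look roughly like this (illustrative only, reusing the v1-style client shown in 5.2):

    from openai import OpenAI

    client = OpenAI()

    # Ask the LLM to map a free-text description onto a predefined taxonomy.
    def tag_with_api(description: str, categories: list[str]) -> str:
        prompt = (
            "Classify the following description into one or more of these "
            f"categories: {', '.join(categories)}. "
            "Answer with the category names only.\n\n"
            f"{description}"
        )
        response = client.chat.completions.create(
            model="gpt-3.5-turbo-16k",
            temperature=0,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content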

The categories that the analysts defined were the following:

  • User_type: farmers, households & individuals, fishers, gig economy, communities, small businesses, medium and large businesses, corporates, employers, urban consumers, youth, manufacturers, other.
  • Technology: AI, mobile app, biotech, desalination, digital advisory, digital credit, digital land rights, digital payments, digital platform, drip irrigation, green energy, hardware, IoT, IVR, risk analysis, satellite & drones, solar, blockchain, USSD, other, none.
  • Solution: biochar, carbon, cold chain & food logistics, electric vehicles, fintech, food systems, green agriculture, green construction, green energy, insect protein, insurtech, land restoration, waste management, water.

We used the 'GPT_TAG()' function for each extracted description, supplying the list of categories and sufficient examples to ensure high-quality tagging. The criteria we used to evaluate the quality of the different approaches were:

  • Reducing the number of user types categorized as 'other' to less than 3%.
  • Ensuring that over 90% of the tags selected by a human were also included in the system-generated tags.

After structuring, the Google Sheet presented the information under the selected categories, one deck per row alongside its assigned tags.

We iterated on a training set and evaluated against these criteria on a separate test set until we achieved the desired values. The model we used for this was "text-davinci-003". Though text-davinci-003 was deprecated on 2024-01-04, OpenAI suggests gpt-3.5-turbo-instruct as a replacement, which has an even lower cost per token than the model we used in this exercise.

  8. Use Bing to retrieve additional information.

We utilized "Bing" to query founder information and OpenAI to verify the relevance of this information. We successfully retrieved information for at least
one co-founder in 91% of the decks. In 23 cases where the founder's information was not retrievable, ten decks did not include it. In most cases, the issue stemmed from poor PDF-to-text conversion, where the founder's name was part of an image. In only one instance, the founder's name was in the text file, and the LLM did not get it correctly.

Since the decks did not include information about gender and university, we employed a different approach to retrieve this data: we used the Bing API to perform a web search, with a function like the one below to query Bing.
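A sketch of such a function against the Bing Web Search v7 endpoint (the snippet extraction assumes the standard webPages response shape):

    import requests

    # Return the text snippets of the top web results for a query.
    def bing_search(query: str, api_key: str, count: int = 4) -> list[str]:
        response = requests.get(
            "https://api.bing.microsoft.com/v7.0/search",
            headers={"Ocp-Apim-Subscription-Key": api_key},
            params={"q": query, "count": count},
        )
        response.raise_for_status()
        pages = response.json().get("webPages", {}).get("value", [])
        return [page["snippet"] for page in pages]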

We used the following query string to achieve this. For example, for the founder "Juveline Ngum Ngwa" and the company "BleaGlee", the query was:

         "Juveline Ngum Ngwa from BleaGlee in which university has studied?"

This query generated relevant text from the web search results, from which we selected the first four.

We then combined those four results into a single text and interrogated it to find the data we aimed to retrieve: gender, university, entrepreneurial experience, and the university's country.

Again, we asked the LLM to provide the answer in JSON format.
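An illustrative shape for that answer (the key names are ours, not the exact schema we used):

    {"gender": "female",
     "university": "NA",
     "first_time_founder": "yes",
     "university_country": "NA"}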

As can be deduced from the prompt above, the LLM (specifically gpt-3.5-turbo-16k) determined the founder's gender and the university's country based on its internal knowledge.

With this approach, we were able to identify the gender for 92.24% of the names, identify whether the founder was a first-time entrepreneur or not in 67.24% of the cases, and correctly retrieve the university for 63.36% of the founders researched. The LLM correctly assigned the country to the university's name in 95% of the cases.

Photo: AI image

Learn more about investing in climate tech innovation in Africa.


Join Catalyst Fund

We are actively looking for early-stage startups that improve the resilience of underserved and climate-vulnerable communities in emerging markets.