Professional Intuition + AI = Better, Faster Patent Searches

Patents have long had a searchability problem. Between classification systems, prior art, and the tens of millions of patents on file, starting a comprehensive patent search is like assembling the first corner of an epic 1,000-piece jigsaw puzzle. Sure, there is that sense of discovery, but it can also be overwhelming, not to mention time-consuming. Technology, and especially artificial intelligence, has long promised to reduce the burden of patent searching, but in practice it isn't that simple.

AI tools have a problem

Why do people search patents? I don’t mean the legal or business reasons. I mean, what motivates people to read through hundreds of patents? From my own experience and what I’ve heard from customers, a big part is the exciting challenge of figuring out the right search logic and the joy of learning new things. But the vast majority of search time is spent weeding through dense, jargon-filled patents, trying to find the proverbial needle in the haystack. That’s not something anyone enjoys, even patent nerds like me.

We built Amplified to simplify patent searching. When we originally set out to do that in 2017, we thought that surely patent analysts would welcome any tool that made their work easier. But static AI results actually remove the only enjoyable part of searching. With no queries and no iteration, reading a list of results is just a slog. Sure, if you happen to get something great right away, that’s helpful. In practice, though, technology and patents are highly subjective, so the odds of getting the right handful of results in a short list drawn from over 100 million patents are, understandably, low.

How humans and machines work together with Amplified

Many existing solutions leverage the human knowledge already built into patent data to increase the odds that the machine finds relevant results. For example, you can predict class codes from text, follow citation networks, or extract and prioritize rare words. These techniques can be very effective, and professional patent researchers use many of them to great effect. The difference is that humans have strong intuition about which techniques to use in which situations, while machines do not. As a result, machines that use hard-coded search tricks are very hit or miss. Sometimes the results are excellent and sometimes they are entirely off-base.

To understand and try to solve the problem of reliably and repeatably finding relevant results, we calculated the odds of finding an X reference using only text. In other words, could we teach an AI model to accurately understand conceptual similarity using full text as-is rather than through shortcuts like sampled keywords, inferred class codes, or citation networks? We compared raw AI results for 1.5 million PCT applications to their corresponding search reports, using only the text from each application’s description. As it turns out, the odds of the AI independently finding an X reference right away are roughly one in four. That’s about 100,000 times better than chance and a pretty good baseline.

But for a trained analyst who knows the field, it is frustrating to have to read through a static list of results in addition to the search work they would be doing anyway. There has to be a way to iterate through the results. Now that we had an AI-based understanding of document-to-document similarity, what would happen if we added just a tiny bit of human knowledge to the AI search, for example a critical keyword or an important class code? The odds jump to 75%. That’s an X reference found for three out of four international search reports in just two clicks!

This is the powerful partnership between human and machine that we hope to amplify. One of the necessary components in solving the problem is rich, accurate, and up-to-date patent data. Amplified’s custom-built patent language model is trained on the text and metadata from the IFI CLAIMS collection.
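
To make the kind of measurement described above concrete, here is a minimal sketch of a "hit rate" calculation: for each application, check whether any X reference from its search report appears among the top results. This is only an illustration under stated assumptions, not Amplified's evaluation code. The function name hit_rate_at_k, the cutoff k, and the toy document IDs are all hypothetical, and the post does not say how many results count as "right away".

```python
def hit_rate_at_k(ranked_results, x_references, k):
    """Fraction of applications with at least one X reference in the top k results.

    ranked_results: dict mapping application id -> list of result ids, best first
    x_references:   dict mapping application id -> set of X reference ids
                    taken from that application's search report
    """
    if not ranked_results:
        return 0.0
    hits = 0
    for app_id, ranked in ranked_results.items():
        cited = x_references.get(app_id, set())
        if any(doc_id in cited for doc_id in ranked[:k]):
            hits += 1
    return hits / len(ranked_results)


# Hypothetical toy data: four applications, three ranked results each.
ranked_results = {
    "app1": ["p3", "p7", "p1"],
    "app2": ["p9", "p2", "p4"],
    "app3": ["p5", "p6", "p8"],
    "app4": ["p2", "p1", "p3"],
}
x_references = {
    "app1": {"p7"},
    "app2": {"p4"},
    "app3": {"p1"},
    "app4": {"p2"},
}

print(hit_rate_at_k(ranked_results, x_references, k=1))  # 0.25
print(hit_rate_at_k(ranked_results, x_references, k=3))  # 0.75
```

The same function can score a ranking before and after a keyword or class code filter is applied, which is one way to compare the two situations described above.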

Technology should facilitate, not dictate

The bottom line: searching is iterative, and the tools we use must be as well. Whether Boolean-based, AI-based, or something else, great tools allow users to efficiently build knowledge by rapidly iterating between searching for information and reading to learn from that information. At Amplified, this led us to a unique solution that is powerful for experienced searchers and intuitive for newbies. Instead of a static list of AI results to read through, we split the search logic into two independent ways of organizing information: sorting and filtering.

Sorting uses patents or text to define what you’re looking for in broad conceptual terms. Amplified uses this to sort the entire database. This part is done through our pre-trained AI, which is extraordinarily good at getting to the right neighborhood instantly.

Filtering uses traditional search techniques like Boolean logic in conjunction with the sorted results. When you reduce the total pool, the odds of finding the right information in the top 20 results improve dramatically. More importantly, you learn much faster. Since the results are automatically sorted by AI-based similarity, the documents you review first are, by definition, the ones with the least noise.
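
The split between sorting and filtering can be sketched in a few lines. This is a toy illustration, not how Amplified works internally: a simple bag-of-words cosine similarity stands in for the trained AI model, and the corpus, the field names (id, text, cpc), and the functions sort_by_similarity and filter_results are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for an AI model: represent a text as lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def sort_by_similarity(query_text, corpus):
    """Sorting: order the whole corpus by similarity to the query text."""
    q = embed(query_text)
    return sorted(corpus, key=lambda d: cosine(q, embed(d["text"])), reverse=True)


def filter_results(results, required_terms=(), cpc_prefix=None):
    """Filtering: keep documents containing every required term (Boolean AND)
    and, optionally, a class code prefix. The sorted order is preserved."""
    kept = []
    for doc in results:
        words = set(doc["text"].lower().split())
        if not all(term.lower() in words for term in required_terms):
            continue
        if cpc_prefix and not doc["cpc"].startswith(cpc_prefix):
            continue
        kept.append(doc)
    return kept


# Hypothetical toy corpus.
corpus = [
    {"id": "A", "text": "battery cooling system for electric vehicle", "cpc": "H01M"},
    {"id": "B", "text": "liquid cooling system for server rack", "cpc": "H05K"},
    {"id": "C", "text": "electric vehicle battery charging station", "cpc": "B60L"},
    {"id": "D", "text": "battery thermal management with liquid cooling", "cpc": "H01M"},
]

ranked = sort_by_similarity("liquid cooling for electric vehicle battery", corpus)
filtered = filter_results(ranked, required_terms=["liquid"], cpc_prefix="H01M")

print([d["id"] for d in ranked])
print([d["id"] for d in filtered])
```

Because the two steps are independent, the filter can be changed without re-running the sort, and the surviving documents stay in similarity order.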

Human intuition aided by machine scale

Let me share a specific example of the human-AI partnership. Often, new Amplified users with patent search experience spend time writing and re-writing invention descriptions in an attempt to find results that their intuition tells them should be there. It’s as if the AI tool creates some pressure to search the “AI way.” If your intuition as an expert searcher means you can quickly identify relevant patents by searching a few simple keywords against title, abstract, and claims, then do that! But then take those results and give them to the AI to find even more. You can use the relevant results you found to re-sort and improve what Amplified returns. In this way, you can leverage both your unique knowledge and the AI’s unique way of understanding patents at massive scale.
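
The workflow above, keywords first and then handing the hits to the AI, might look roughly like this. It is a sketch under loud assumptions: bag-of-words cosine similarity stands in for the AI model, the seed documents are combined by simply adding their word counts, and keyword_search, resort_by_seeds, and the toy corpus are hypothetical names and data rather than Amplified's API.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for an AI model: represent a text as lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def keyword_search(corpus, terms):
    """Step 1: the searcher's own intuition, as a simple AND keyword search."""
    return [
        d for d in corpus
        if all(t.lower() in d["text"].lower().split() for t in terms)
    ]


def resort_by_seeds(corpus, seeds):
    """Step 2: re-sort the remaining documents by similarity to the seeds."""
    combined = Counter()
    for seed in seeds:
        combined.update(embed(seed["text"]))  # add word counts across seeds
    seed_ids = {s["id"] for s in seeds}
    rest = [d for d in corpus if d["id"] not in seed_ids]
    return sorted(rest, key=lambda d: cosine(combined, embed(d["text"])), reverse=True)


# Hypothetical toy corpus.
corpus = [
    {"id": "A", "text": "drone propeller with folding blades"},
    {"id": "B", "text": "folding rotor blades for unmanned aerial vehicle"},
    {"id": "C", "text": "kitchen blender with removable jar"},
    {"id": "D", "text": "unmanned aerial vehicle rotor guard"},
]

seeds = keyword_search(corpus, ["drone"])   # finds only document A
more = resort_by_seeds(corpus, seeds)       # B rises to the top

print([d["id"] for d in seeds])
print([d["id"] for d in more])
```

In this toy case document B never mentions the word "drone", so the keyword search misses it, but it shares enough vocabulary with the seed to come first after re-sorting.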

AI learns, but so can users

Here’s the key: iterating has to be understandable to the person searching. Just clicking a button to re-train the AI doesn’t work. It’s too hard to understand what it’s doing, so the output just feels like another random list. The AI might be learning, but there’s no learning happening for you, the user. The power of combining keywords with AI sorting is in the way it allows you to iterate and learn faster.

We’re still on a mission to continually improve our core AI and extend its capabilities to tasks beyond searching, such as custom classification. Our guiding light, however, is building intuitive technology that amplifies people’s knowledge. We might not have known all of this at the beginning, but one thing is for sure: we picked the right name!

Searching is the focus of this particular post, but I think the principles are the same for any kind of knowledge-intensive work. What do you think? How can technology be better used to empower IP professionals? Follow us on LinkedIn to share your thoughts and join the conversation.

This post was authored by Sam Davis and originally featured on the IFI CLAIMS blog on April 26th, 2022.