My 0.02...
1) It is important that we do what we can to make it easy for people
to integrate Tika into the dense vector/LLM/RAG landscape. I see A LOT
of projects reinventing the wheel (without multi-parser full recursion
like we have), or just running pdftotext and declaring victory. So, if
we ca
Your approach sounds great as well, Nick.
> On Apr 9, 2024, at 2:21 AM, Michael Wechner wrote:
>
> Thanks for sharing your approach!
>
> Do you already have some code to share?
>
> Today I read about https://github.com/infiniflow/ragflow which might also
> have some interesting chunking approaches.
Thanks for sharing your approach!
Do you already have some code to share?
Today I read about https://github.com/infiniflow/ragflow which might
also have some interesting chunking approaches.
Thanks
Michael
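Since the thread is comparing chunking approaches, here is a minimal, dependency-free sketch of similarity-based ("semantic") chunking: start a new chunk whenever the next sentence is dissimilar from the running chunk. This is not Tika or ragflow code; the bag-of-words `embed` below is a toy stand-in for a real sentence-embedding model, and the `threshold` value is an illustrative assumption.

```python
import math
import re

def embed(sentence):
    # Toy bag-of-words "embedding": a dict of word counts.
    # A real pipeline would use a sentence-embedding model here.
    vec = {}
    for word in re.findall(r"[a-z']+", sentence.lower()):
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(text, threshold=0.2):
    # Naive sentence split on terminal punctuation; dependency-free.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sent in sentences:
        # Start a new chunk when the next sentence drifts off-topic.
        if current and cosine(embed(" ".join(current)), embed(sent)) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

The point of the sketch is only the boundary rule: compare each incoming sentence against the accumulated chunk and cut where similarity drops, rather than cutting at a fixed byte count.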
On 09.04.24 at 01:25, Nick Burch wrote:
On Mon, 8 Apr 2024, Tim Allison wrote:
> Not sure we should jump on the bandwagon, but anything we can do to
> support smart chunking would benefit us.
Could just be more integrations with parsers that turn out to be useful. I
haven't had much joy with some. Here's one that I haven't evaluated yet:
I am also very interested in this vector-based search. Indexes are a big
thing right now.
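To illustrate the vector-based search mentioned above: once chunks have embeddings, the simplest retrieval is a brute-force scan that scores every chunk vector against the query by cosine similarity and returns the top-k. A minimal sketch, with the `search` helper and its toy vectors being hypothetical; production systems would use an approximate-nearest-neighbor index instead of a full scan.

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, index, top_k=3):
    # index: list of (chunk_text, embedding_vector) pairs.
    # Brute-force: score everything, return the best-scoring texts.
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in scored[:top_k]]
```

Example: with `index = [("chunk-a", [1.0, 0.0]), ("chunk-b", [0.0, 1.0]), ("chunk-c", [0.9, 0.1])]`, a query vector of `[1.0, 0.0]` ranks chunk-a first, then chunk-c.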
On Mon, Apr 8, 2024, 4:16 PM Michael Wechner wrote:
> It would be great to have good "semantic chunking" in order to generate
> vector embeddings.
>
> Thanks for the link below, will try to test it.
>
> Thanks
> Michael
It would be great to have good "semantic chunking" in order to generate
vector embeddings.
Thanks for the link below, will try to test it.
Thanks
Michael
On 08.04.24 at 18:29, Tim Allison wrote:
Not sure we should jump on the bandwagon, but anything we can do to support
smart chunking would benefit us.