Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Vijaykumar Jain
http://tika.apache.org/ To get started with collecting doc metadata. It looks like this tool can help you get started. PostgreSQL does support fuzzy text search, so I do think dumping the metadata/abstract into PostgreSQL and then using extensions like trigram, tsearch, etc. should work well for a POC. this bein
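
To make the trigram idea above concrete, here is a minimal sketch in plain Python of how trigram similarity scoring works. It only approximates the behavior of PostgreSQL's pg_trgm extension (lowercasing, padding each word with two leading spaces and one trailing space, then comparing trigram sets); it is not the extension's implementation. The function names, the ASCII-only word splitting, and the `rank_titles` helper are assumptions made for illustration.

```python
import re


def trigrams(text):
    """Return the set of trigrams in text, approximating pg_trgm's rules.

    Assumption: words are runs of ASCII letters and digits; the real
    extension decides what counts as a word character differently.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Pad with two spaces in front and one behind, as pg_trgm does.
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_titles(query, titles, threshold=0.3):
    """Hypothetical helper: keep titles scoring at least threshold, best first."""
    scored = [(similarity(query, title), title) for title in titles]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


if __name__ == "__main__":
    for score, title in rank_titles(
        "autism", ["Autism spectrum disorder", "Cardiology review"], threshold=0.1
    ):
        print(round(score, 2), title)
```

In a real POC the scoring would be left to the database and an index on the metadata columns; the sketch is only meant to show why a misspelled or partial title can still match.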

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Adrian Klaver
On 6/5/21 2:49 AM, Achilleas Mantzios wrote: Hello I am imagining a system that can parse papers from various sources (web/files/etc) and in various formats (text, pdf, etc) and can store metadata for this paper, some kind of global ID if applicable, authors, areas of research, whether the pa

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios
On 5/6/21 6:34 PM, Adrian Klaver wrote: On 6/5/21 2:49 AM, Achilleas Mantzios wrote: Hello I am imagining a system that can parse papers from various sources (web/files/etc) and in various formats (text, pdf, etc) and can store metadata for this paper, some kind of global ID if app

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios
On 5/6/21 4:45 PM, Vijaykumar Jain wrote: http://tika.apache.org/ I checked; it behaves better with downloaded PDFs than with URL PDFs, and in the second case the metadata is poor. It does not work with NIH articles (but this is a general problem, not Tika's). To get

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Adrian Klaver
On 6/5/21 9:56 AM, Achilleas Mantzios wrote: On 5/6/21 6:34 PM, Adrian Klaver wrote: On 6/5/21 2:49 AM, Achilleas Mantzios wrote: Hello I am imagining a system that can parse papers from various sources (web/files/etc) and in various formats (text, pdf, etc) and can store metadata

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios
On 5/6/21 8:03 PM, Adrian Klaver wrote: On 6/5/21 9:56 AM, Achilleas Mantzios wrote: On 5/6/21 6:34 PM, Adrian Klaver wrote: On 6/5/21 2:49 AM, Achilleas Mantzios wrote: Hello I am imagining a system that can parse papers from various sources (web/files/etc) and in vario

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Adrian Klaver
On 6/5/21 10:39 AM, Achilleas Mantzios wrote: On 5/6/21 8:03 PM, Adrian Klaver wrote: On 6/5/21 9:56 AM, Achilleas Mantzios wrote: On 5/6/21 6:34 PM, Adrian Klaver wrote: On 6/5/21 2:49 AM, Achilleas Mantzios wrote: Hello I am imagining a system that can parse papers from

Re: strange behavior of WAL files

2021-06-05 Thread Tom Lane
Atul Kumar writes: > Please check my findings below > older > -rw--- 1 enterprisedb enterprisedb 16777216 Jun 2 02:47 > 000136CF00A4 > -rw--- 1 enterprisedb enterprisedb 16777216 Jun 2 02:45 > 000136CF00A3 > -rw--- 1 enterprisedb enterprisedb 16777216 Jun 2

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios
On 5/6/21 10:12 PM, Adrian Klaver wrote: On 6/5/21 10:39 AM, Achilleas Mantzios wrote: On 5/6/21 8:03 PM, Adrian Klaver wrote: On 6/5/21 9:56 AM, Achilleas Mantzios wrote: On 5/6/21 6:34 PM, Adrian Klaver wrote: On 6/5/21 2:49 AM, Achilleas Mantzios wrote: Hell

Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios
Hello I am imagining a system that can parse papers from various sources (web/files/etc) and in various formats (text, pdf, etc) and can store metadata for this paper, some kind of global ID if applicable, authors, areas of research, whether the paper is "new", "highlighted", "historical", ty

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Laura Smith
‐‐‐ Original Message ‐‐‐ On Saturday, 5 June 2021 10:49, Achilleas Mantzios wrote: > Hello > > I am imagining a system that can parse papers from various sources > (web/files/etc) and in various formats (text, pdf, etc) and can store > metadata for this paper, some kind of global ID if

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios
On 5/6/21 1:52 PM, Laura Smith wrote: ‐‐‐ Original Message ‐‐‐ On Saturday, 5 June 2021 10:49, Achilleas Mantzios wrote: Hello I am imagining a system that can parse papers from various sources (web/files/etc) and in various formats (text, pdf, etc) and can store metadata f

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Karsten Hilbert
> I am imagining a system that can parse papers from various sources > (web/files/etc) and in various formats (text, pdf, etc) and can store > metadata for this paper, some kind of global ID if applicable, authors, > areas of research, whether the paper is "new", "highlighted", > "historical", type

Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Laura Smith
‐‐‐ Original Message ‐‐‐ On Saturday, 5 June 2021 12:14, Achilleas Mantzios wrote: > > I know it's a huge amount of work, but you are missing a point. Nobody wishes to > compete with anyone. This is about a project, a parent-advocacy > non-profit that ONLY