Re: Combine Data from PDF + XML

2016-10-26 Thread Erick Erickson
In that case you'll have to write an indexing client that (probably)
uses Tika to parse the PDF file and some kind of XML parser to parse
the metadata XML, then combines the two into Solr documents that you
send to Solr. Here's a skeletal program with some extra stuff in there
for database connectivity, but you should be able to chop that out
pretty easily.

https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
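
For illustration only, here is a minimal sketch of the "combine the two"
step described above, written in Python rather than SolrJ to keep it short.
It assumes the PDF text has already been extracted elsewhere (for example by
Tika), that the metadata XML is a flat list of elements, and that the field
names (`id`, `content`) and the update URL you pass in are hypothetical
placeholders that would have to match your own schema and collection.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def build_solr_doc(doc_id, metadata_xml, pdf_text):
    """Merge one flat metadata XML record and its PDF text into one document."""
    # "id" and "content" are assumed field names, not taken from any real schema.
    doc = {"id": doc_id, "content": pdf_text.strip()}
    root = ET.fromstring(metadata_xml)
    for child in root:
        value = (child.text or "").strip()
        if not value or child.tag in ("id", "content"):
            continue  # skip empty elements and never overwrite the reserved fields
        if child.tag not in doc:
            doc[child.tag] = value
        elif isinstance(doc[child.tag], list):
            doc[child.tag].append(value)
        else:
            # A repeated element becomes a multi-valued field.
            doc[child.tag] = [doc[child.tag], value]
    return doc


def to_update_payload(docs):
    """Serialize documents as the JSON array accepted by Solr's update handler."""
    return json.dumps(docs)


def post_to_solr(update_url, docs):
    """POST the documents to an update URL such as .../solr/<collection>/update."""
    request = urllib.request.Request(
        update_url,
        data=to_update_payload(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    sample_xml = (
        "<metadata><title>Annual Report</title>"
        "<author>A. Smith</author><author>B. Jones</author></metadata>"
    )
    doc = build_solr_doc("report-1", sample_xml, "Text extracted from the PDF.")
    print(to_update_payload([doc]))
```

Because each PDF and its metadata end up in a single document, one query can
match on either the metadata fields or the PDF text and return the same hit.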

Best,
Erick


On Wed, Oct 26, 2016 at 1:47 PM, tesm...@gmail.com  wrote:
> Hi Erick,
>
> Thanks for your reply.
>
> Yes, the XML files contain metadata about the PDF files. I need to search
> across both the XML and PDF files and show results from both sources.
>
>
> Regards,
>
> On Wed, Oct 26, 2016 at 1:47 AM, Erick Erickson 
> wrote:
>
>> First you need to define the problem.
>>
>> What do you mean by "combine"? Do the XML files
>> contain, say, metadata about an associated PDF file?
>>
>> Or are these entirely orthogonal documents that
>> you need to index into the same collection?
>>
>> Best,
>> Erick
>>
>> On Tue, Oct 25, 2016 at 4:18 PM, tesm...@gmail.com 
>> wrote:
>> > Hi,
>> >
>> > I am new to Apache Solr and am developing a search project. The source
>> > data comes from two sources:
>> >
>> > 1) XML Files
>> >
>> > 2) PDF Files
>> >
>> >
>> > I need to combine these two sources for search. I couldn't find an example
>> > of combining them. Any help is appreciated.
>> >
>> >
>> > Regards,
>>


Re: Combine Data from PDF + XML

2016-10-26 Thread tesm...@gmail.com
Hi Erick,

Thanks for your reply.

Yes, the XML files contain metadata about the PDF files. I need to search
across both the XML and PDF files and show results from both sources.


Regards,

On Wed, Oct 26, 2016 at 1:47 AM, Erick Erickson 
wrote:

> First you need to define the problem.
>
> What do you mean by "combine"? Do the XML files
> contain, say, metadata about an associated PDF file?
>
> Or are these entirely orthogonal documents that
> you need to index into the same collection?
>
> Best,
> Erick
>
> On Tue, Oct 25, 2016 at 4:18 PM, tesm...@gmail.com 
> wrote:
> > Hi,
> >
> > I am new to Apache Solr and am developing a search project. The source
> > data comes from two sources:
> >
> > 1) XML Files
> >
> > 2) PDF Files
> >
> >
> > I need to combine these two sources for search. I couldn't find an example
> > of combining them. Any help is appreciated.
> >
> >
> > Regards,
>


Re: Combine Data from PDF + XML

2016-10-25 Thread Erick Erickson
First you need to define the problem.

What do you mean by "combine"? Do the XML files
contain, say, metadata about an associated PDF file?

Or are these entirely orthogonal documents that
you need to index into the same collection?

Best,
Erick

On Tue, Oct 25, 2016 at 4:18 PM, tesm...@gmail.com  wrote:
> Hi,
>
> I am new to Apache Solr and am developing a search project. The source
> data comes from two sources:
>
> 1) XML Files
>
> 2) PDF Files
>
>
> I need to combine these two sources for search. I couldn't find an example
> of combining them. Any help is appreciated.
>
>
> Regards,


Combine Data from PDF + XML

2016-10-25 Thread tesm...@gmail.com
Hi,

I am new to Apache Solr and am developing a search project. The source
data comes from two sources:

1) XML Files

2) PDF Files


I need to combine these two sources for search. I couldn't find an example
of combining them. Any help is appreciated.


Regards,