Hi,
one more thing: can you please post the first few lines of
{indexing-source}/indexing/resources/incoming_links.txt
so that I can check the data against the configuration in the
iditerator.properties file?
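
In case it helps, here is a minimal Python sketch for grabbing those lines without opening the whole file in an editor. The function name first_lines and the default of 10 lines are just my choices for illustration; the path is whatever you pass on the command line.

```python
import sys
from itertools import islice
from pathlib import Path


def first_lines(path, count=10):
    """Return the first `count` lines of a text file, reading only what is needed."""
    # errors="replace" keeps the script from failing on bytes that are not valid UTF-8
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, count)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python first_lines.py <path-to-incoming_links.txt>
    for line in first_lines(sys.argv[1]):
        print(line)
```

Because islice stops after the requested number of lines, this should stay fast even if the file is large.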
best
Rupert
On Thu, Aug 23, 2012 at 10:31 PM, Rupert Westenthaler
<[email protected]> wrote:
> Hi
>
> The log clearly shows that you only imported the triples from the dumps
> into the Jena TDB triple store used as the source for the indexing.
>
> See all the lines such as
>
> 08:14:08,196 [Thread-5] INFO tdb.loader - Add: 50,000 triples
> (Batch: 3,256 / Avg: 3,256)
> 08:14:12,802 [Thread-5] INFO tdb.loader - Add: 100,000 triples
> (Batch: 10,855 / Avg: 5,009)
>
> BTW: this only needs to be done once. After this initialization step
> completes, you can remove the RDF files from
> "{indexing-root}/indexing/resources/rdfdata/" (I usually just rename
> the rdfdata folder to imported-rdfdata).
>
> The ~1.5hrs are just the time needed to import the data from the RDF
> dumps to the Jena TDB store.
>
> With
>
> 08:18:04,242 [main] INFO impl.IndexerImpl - Indexing started ...
>
> the indexing starts and
>
> 08:21:03,176 [Indexing: Finished Entity Logger Deamon] INFO
> impl.IndexerImpl - Indexed 0 items in 1410320sec (Infinityms/item):
> processing: -1.000ms/item | queue: -1.000ms
>
> states clearly that not a single entity was indexed.
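
A rough way to check this without reading the whole log by eye, sketched in Python. It assumes each log entry sits on a single line (the mail client wrapped the examples quoted above) and that the two message formats are exactly as quoted; the function name summarize_log and the regular expressions are mine, not part of the indexing tool.

```python
import re

# Patterns taken from the log lines quoted in this thread (assumed stable)
TRIPLES = re.compile(r"tdb\.loader - Add: ([\d,]+) triples")
INDEXED = re.compile(r"IndexerImpl - Indexed (\d+) items")


def summarize_log(lines):
    """Return (highest triple count reported by the TDB loader, indexed items or None)."""
    triples = 0
    indexed = None
    for line in lines:
        match = TRIPLES.search(line)
        if match:
            # Counts are printed with thousands separators, e.g. "100,000"
            triples = max(triples, int(match.group(1).replace(",", "")))
            continue
        match = INDEXED.search(line)
        if match:
            indexed = int(match.group(1))
    return triples, indexed


# Sample lines copied from the log excerpts in this thread, unwrapped
SAMPLE = [
    "08:14:08,196 [Thread-5] INFO tdb.loader - Add: 50,000 triples (Batch: 3,256 / Avg: 3,256)",
    "08:14:12,802 [Thread-5] INFO tdb.loader - Add: 100,000 triples (Batch: 10,855 / Avg: 5,009)",
    "08:18:04,242 [main] INFO impl.IndexerImpl - Indexing started ...",
    "08:21:03,176 [Indexing: Finished Entity Logger Deamon] INFO impl.IndexerImpl - "
    "Indexed 0 items in 1410320sec (Infinityms/item): processing: -1.000ms/item | queue: -1.000ms",
]

print(summarize_log(SAMPLE))  # prints (100000, 0)
```

A non-zero triple count together with 0 indexed items is the pattern seen here: the import into TDB worked, but the indexing step itself produced nothing.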
>
> I guess this has to do with the configuration. I will have a look at
> it tomorrow morning.
>
> best
> Rupert
>
> On Thu, Aug 23, 2012 at 9:53 PM, harish suvarna <[email protected]> wrote:
>> I am attaching the zip of the config folder. The indexing takes quite some
>> time (~1.5hrs). The number of triples it generates is high.
>> I am also attaching the English indexing output. I used 10 files (all except
>> long_abstracts_en.nt, which is 2.5 GB and which I could not save as UTF-8 on
>> my Mac). But for Chinese I had all files.
>> -harish
>>
>>
>> On Thu, Aug 23, 2012 at 12:27 PM, Rupert Westenthaler
>> <[email protected]> wrote:
>>>
>>> I would expect the dbpedia.solrindex.zip file to be several hundred
>>> MByte in size (if not gigabytes).
>>>
>>> The only explanation for this file to be so small is that something is
>>> going wrong during indexing.
>>>
>>> Can you maybe provide the {indexing-root}/indexing/config folder so
>>> that I can have a look at your configuration?
>>>
>>> best
>>> Rupert
>>>
>>> On Thu, Aug 23, 2012 at 5:49 PM, harish suvarna <[email protected]>
>>> wrote:
>>> >
>>> > Rupert,
>>> > I generated the index for the dbpedia3.8 English files only.
>>> > One thing that intrigues me is that the dbpedia.solrindex.zip file size
>>> > is 53 KB, the same as when I generated it for Chinese, although the
>>> > English files are much bigger.
>>> > In the English zip I also can't find Paris.
>>> > I am attaching the English dbpedia.solrindex.zip for any clues.
>>> > Do I need to load the bundle jar file created by the dbpedia indexing?
>>> >
>>> > -harish
>>>
>>>
>>>
>>> --
>>> | Rupert Westenthaler [email protected]
>>> | Bodenlehenstraße 11 ++43-699-11108907
>>> | A-5500 Bischofshofen
>>
>>
>>
>>
>> --
>> Thanks
>> Harish
>>
--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen