Hi Stuart,
As I mentioned in my earlier post, running filter-media with the --force
(-f) switch didn't fix the problem.

-Mika

2009/6/16 Stuart Lewis <s.le...@auckland.ac.nz>:
> Hi Mika,
>
> Since running filter-media on new items seems OK, have you tried running:
>
> [dspace]/bin/filter-media -f
>
> -f forces all the bitstreams to be re-filtered.
>
> Thanks,
>
>
> Stuart Lewis
> Digital Services Programmer
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: 64 9 373-7599 x81928
> http://www.library.auckland.ac.nz/
>
>
>
> -----Original Message-----
> From: mikan.d.dspace listmail [mailto:mikan.dsp...@gmail.com]
> Sent: Tuesday, 16 June 2009 1:05 a.m.
> To: Terrance Davis
> Cc: Dspace Tech
> Subject: Re: [Dspace-tech] DSpace search weirdness
>
> Nope.
> Server 1 has Debian 5 with Java version "1.6.0_12", and server 2
> has RHEL with Java version "1.5.0_18". Could this cause the problem?
>
> Another strange thing I noticed is that if I re-submit the entire
> item and file and then run filter-media, the text is extracted
> correctly. So, to me it seems that the old data in the transferred
> assetstore is handled incorrectly. Strange, eh?
>
> -Mika
>
>
>
>
> 2009/6/15 Terrance Davis <terrance.da...@utah.edu>:
>> Hi Mika,
>>
>> Are both systems using the same OS version and the same version of Java?
>>
>> Best regards,
>>
>> Terrance
>>
>> --
>> Web Applications Programmer
>> Institute for Clean and Secure Energy
>> University of Utah
>> http://www.ices.utah.edu
>>
>>
>> On Jun 15, 2009, at 2:01 AM, mikan.d.dspace listmail wrote:
>>
>>> Hi Terrance,
>>>
>>> I double-checked the indexes in the configuration and they do match.
>>> What I noticed, though, is that the text extracted from the PDF files
>>> differs, which might be the cause of this problem. It seems that when
>>> filter-media extracts the text on the second server, it messes up some
>>> special characters, making them unsearchable. What might be
>>> causing this? Both databases were set to UNICODE when created. Is
>>> there some other system setting that might be causing this?
>>>
>>> Example of extracted text is below:
>>>
>>> Server 1: (correct encoding)
>>> 3. PUNAISEN KIRJAN SISÄLTÖ
>>> Jaettiin punaisen kirjan sisällön päivitystä varten vastuuhenkilöt
>>> seuraavaksi:
>>> 3.1 Yleisasu ja kirjan sisällön järjestys miettii ja tarkastelee Tiina
>>> Sairanen
>>>
>>> Server 2: (Messed up characters)
>>>
>>> 3. PUNAISEN KIRJAN SIS?LT?
>>> Jaettiin punaisen kirjan sis?ll?n p?ivityst? varten vastuuhenkil?t
>>> seuraavaksi:
>>> 3.1 Yleisasu ja kirjan sis?ll?n j?rjestys miettii ja tarkastelee Tiina
>>> Sairanen
>>>
>>>
>>> Thanks for any help,
>>> Mika
>>>
>>>
>>> 2009/6/12 Terrance Davis <terrance.da...@utah.edu>:
>>>>
>>>> Hi Mika,
>>>> My first guess is that your config files don't match. You might want to
>>>> check the server that is returning 40 results. If the configured search
>>>> indexes have any white space (such as a tab) after the properties, they
>>>> might not match the Dublin Core fields and so might not be indexed
>>>> properly. No trim() is happening on the configured search index
>>>> properties from the 1.5.2 dspace.cfg, so they may look the same but be
>>>> thrown off by extra unwanted white space.
>>>> Best regards,
>>>> Terrance Davis
>>>> --
>>>> Web Applications Programmer
>>>> Institute for Clean and Secure Energy
>>>> University of Utah
>>>> http://www.ices.utah.edu/
>>>>
>>>>
>>>>
>>>> On Jun 12, 2009, at 5:24 AM, mikan.d.dspace listmail wrote:
>>>>
>>>> I'm confused by the way DSpace search works. I cloned our DSpace 1.5.2
>>>> instance to another server. They both have the same config, same items
>>>> etc. However, when I run a search I get different results. With the same
>>>> search term, one server shows 40 results and the other 72. I've
>>>> forced reindexing and media-filters but nothing changes. What could be
>>>> the cause of this?
>>>>
>>>> Thanks,
>>>> Mika
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Crystal Reports - New Free Runtime and 30 Day Trial
>>>> Check out the new simplified licensing option that enables unlimited
>>>> royalty-free distribution of the report engine for externally facing
>>>> server and web deployment.
>>>> http://p.sf.net/sfu/businessobjects
>>>> _______________________________________________
>>>> DSpace-tech mailing list
>>>> DSpace-tech@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>>>
>>>>
>>
>>
>
>
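
The `?` substitutions in the Server 2 sample quoted above look like what Java produces when text is encoded with a charset that cannot represent characters such as Ä and Ö. Whether that is what happens inside filter-media is not confirmed anywhere in this thread; it is only one possible explanation, and it assumes the two JVMs run with different default charsets. The sketch below is a hypothetical stand-alone check (the class name `EncodingCheck` and the method `roundTrip` are made up for illustration and are not part of DSpace). It prints the JVM's default encoding and shows the lossy round trip. It is written for a current JDK, since `StandardCharsets` is not available on the Java 5 and 6 runtimes mentioned above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    /** Encodes text with the given charset and decodes it again. */
    static String roundTrip(String text, Charset charset) {
        // String.getBytes(Charset) replaces characters the charset cannot
        // represent with the charset's replacement byte ('?' for US-ASCII).
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Run this on both servers and compare these two lines.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        // "SISÄLTÖ", written with escapes so the source file encoding does not matter.
        String sample = "PUNAISEN KIRJAN SIS\u00C4LT\u00D6";

        // UTF-8 can represent every character, so the text survives intact.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8));
        // US-ASCII cannot represent Ä or Ö, so each one becomes '?'.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII));
    }
}
```

If the two servers report different default charsets, starting the JVM that runs filter-media with `-Dfile.encoding=UTF-8` and then re-running `[dspace]/bin/filter-media -f` would be one thing to try. How that option gets passed to the filter-media script depends on the installation.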
