Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Andy
> > 1) hyphens - if user types "ema" or "e-ma" I want to
> > suggest "email"
> > 
> > 2) accents - if user types "herme"  want to suggest
> > "Hermès"
> 
> Accents can be removed using MappingCharFilterFactory
> before the tokenizer (at both index and query time):
> 
> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
> 
> I am not sure if this is the most elegant solution, but you can
> replace "-" with "" using MappingCharFilterFactory too. It
> satisfies what you describe in 1.
> 
> But generally NGramFilterFactory produces a lot of tokens.
> I mean the query "er" can return "hermes". Maybe
> EdgeNGramFilterFactory is more suitable for the
> auto-complete task. At least it guarantees that some word
> starts with that character sequence.

Thanks.

I agree with the issues with NGramFilterFactory you pointed out and I really 
want to avoid using it. But the problem is that I have Chinese tags like "电吉他" 
and multi-lingual tags like "electric吉他".

For tags like that WhitespaceTokenizerFactory wouldn't work. And if I use 
ChineseFilterFactory would it recognize that the "electric" in "electric吉他" 
isn't Chinese and shouldn't be split into individual characters?

Any ideas here are greatly appreciated.

In a related matter, I checked out 
http://lucene.apache.org/solr/api/org/apache/solr/analysis/package-tree.html 
and saw that there are:

EdgeNGramFilterFactory & EdgeNGramTokenizerFactory
NGramFilterFactory & NGramTokenizerFactory

What are the differences between *FilterFactory and *TokenizerFactory? In my 
case which one should I be using?

Thanks.





Re: multi level faceting

2010-10-04 Thread Allistair Crossley
I think that is just sending 2 fq filter queries through. In Solr PHP I would do 
that with, e.g.

$params['facet'] = true;
$params['facet.fields'] = array('Size');
$params['fq'] = array('sex' => array('Men', 'Women'));

but yes, I think you'd have to send through what the current facet query is and 
add it to your next drill-down.
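
For reference, a rough sketch of the equivalent raw Solr request (standard q/fq/facet parameters; the host, core path and the "sex"/"Size" field names are just the examples from this thread):

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=Size&fq=sex:(Men+OR+Women)

Each further drill-down then just appends another fq parameter to the same request.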

On Oct 4, 2010, at 9:36 AM, Nguyen, Vincent (CDC/OD/OADS) (CTR) wrote:

> Hi,
> 
> 
> 
> I was wondering if there's a way to display facet options based on
> previous facet values.  For example, I've seen many shopping sites where
> a user can facet by "Mens" or "Womens" apparel, then be shown "sizes" to
> facet by (for Men or Women only - whichever they chose).  
> 
> 
> 
> Is this something that would have to be handled at the application
> level?
> 
> 
> 
> Vincent Vu Nguyen
> 
> 
> 
> 
> 



Re: Prioritizing advectives in solr search

2010-10-04 Thread Otis Gospodnetic
Hi Hasnain,

You'll need to apply POS (Part of Speech) tagging on the input at/before indexing, then 
store a payload with your adjective terms, and finally use those payload 
values to change the scoring at query time.
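
To make the indexing half of that concrete, a minimal sketch of a payload-aware field type (assuming the POS tagger has already rewritten the input into tokens like "blue|0.1 hammer|1.0" before it reaches Solr; the field type name and delimiter are illustrative):

<fieldType name="text_payload" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- stores the number after "|" as a per-term payload -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="|" encoder="float"/>
  </analyzer>
</fieldType>

Reading those payloads back into the score still needs a custom query/similarity on top (a PayloadTermQuery/BoostingTermQuery-style component); stock Solr does not use payloads in scoring out of the box.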

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Hasnain 
> To: solr-user@lucene.apache.org
> Sent: Fri, October 1, 2010 3:15:54 AM
> Subject: Prioritizing advectives in solr search
> 
> 
> Hi,
> 
>My question is related to search results giving less  importance to
> adjectives, 
> 
> here is my scenario, im using dismax  handler and my understanding is when I
> query "Blue hammer", solr brings me  results for "blue hammer", "blue" and
> "hammer", and in the same hierarchy,  which is understandable, is there any
> way I can manage the "blue" keyword, so  that solr searches for "blue hammer"
> and "hammer" and not any results for  "blue".
> 
> my handler is as follows...
> 
>  
> 
>  
>  dismax
>explicit
>  0.6
>  name^2.3  mat_nr^0.4
> 0% 
> 
> any suggestion on this??
> -- 
> View this message in context: 
>http://lucene.472066.n3.nabble.com/Prioritizing-advectives-in-solr-search-tp1613029p1613029.html
>
> Sent  from the Solr - User mailing list archive at Nabble.com.
> 


RE: DIH sub-entity not indexing

2010-10-04 Thread Ephraim Ofir
The closest you can get to debugging (without actually debugging...) is
to look at the logs and use
http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mode
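
(That is the debug console that normally lives at something like http://localhost:8983/solr/admin/dataimport.jsp - paste the config into the left-hand form, run it with debug/verbose ticked, and the right-hand side shows the documents each entity produced, which usually makes the offending query obvious. The host, port and path here are the stock example values; adjust for your install.)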

Ephraim Ofir


-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
Sent: Monday, October 04, 2010 3:09 PM
To: solr-user@lucene.apache.org
Subject: Re: DIH sub-entity not indexing

Thanks Ephraim. I tried your suggestion with the ID but capitalising it
did not work. 

Indeed, I have a column that already works using a lower-case id. I wish
I could debug it somehow - see the SQL? Something particular about this
config it is not liking.

I read the post you linked to. This is more a performance-related thing
for him. I would be happy just to see low performance and my contacts
populated right now!! :D

Thanks again

On Oct 4, 2010, at 9:00 AM, Ephraim Ofir wrote:

> Make sure you're not running into a case sensitivity problem, some
stuff
> in DIH is case sensitive (and some stuff gets capitalized by the
jdbc).
> Try using listing.ID instead of listing.id.
> On a side note, if you're using mysql, you might want to look at the
> CONCAT_WS function.
> You might also want to look into a different approach than
sub-entities
> -
>
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3
>
c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com
> %3E
> 
> Ephraim Ofir
> 
> -Original Message-
> From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
> Sent: Monday, October 04, 2010 2:49 PM
> To: solr-user@lucene.apache.org
> Subject: Re: DIH sub-entity not indexing
> 
> I have tried a more elaborate join also following the features example
> of the DIH example but same result - SQL works fine directly but Solr
is
> not indexing the array of full_names per Listing, e.g.
> 
> 
> 
>   query="select * from listing_contacts where
> listing_id = '${listing.id}'">
>   query="select concat(first_name,
> concat(' ', last_name)) as full_name from contacts where id =
> '${listing_contact.contact_id}'">
>   
>   
>
> 
> 
> 
> Am I missing the obvious?
> 
> On Oct 4, 2010, at 8:22 AM, Allistair Crossley wrote:
> 
>> Hello list,
>> 
>> I've been successful with DIH to a large extent but a seemingly
simple
> extra column I need is posing problems. In a nutshell I have 2
entities
> let's say - Listing habtm Contact. I have copied the relevant parts of
> the configs below.
>> 
>> I have run my SQL for the sub-entity Contact and this is produces
> correct results. No errors are given by Solr on running the import.
Yet
> no records are being set with the contacts array.
>> 
>> I have taken out my sub-entity config and replaced it with a simple
> template value just to check and values then come through OK.
>> 
>> So it certainly seems limited to my query or query config somehow. I
> followed roughly the example of the DIH bundled example.
>> 
>> DIH.xml
>> ===
>> 
>> 
>> ...
>> > query="select concat(c.first_name, concat(' ', c.last_name)) as
> full_name from contacts c inner join listing_contacts lc on c.id =
> lc.contact_id where lc.listing_id = '${listing.id}'">
>> 
>> 
>> 
>> SCHEMA.XML
>> 
>>  multiValued="true" required="false" />
>> 
>> 
>> Any tips appreciated.
> 



Re: Prioritizing advectives in solr search

2010-10-04 Thread Hasnain

Hi Otis,

 Thank you for replying. Unfortunately I'm unable to fully grasp what
you are trying to say; can you please elaborate on what a payload with
adjective terms is?

Also, I'm using stopwords.txt to stop adjectives, adverbs and verbs. Now when
I search for "Blue hammers", Solr searches for "blue hammers" and "hammers"
but not "blue". But the problem here is that a user can also search for just
"Blue", and then it won't find anything...

Any suggestions on this?

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Prioritizing-adjectives-in-solr-search-tp1613029p1629725.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multiple masters and replication between masters?

2010-10-04 Thread Upayavira
On Mon, 2010-10-04 at 00:25 +0530, Arunkumar Ayyavu wrote: 
> I'm looking at setting up multiple masters for redundancy (for index
> updates). I found the thread in this link
> (http://www.lucidimagination.com/search/document/68ac303ce8425506/multiple_masters_solr_replication_1_4)
> discussed this subject more than a year back. Does Solr support such
> configuration today?

Solr does not support master/master replication. When you commit
documents to SOLR, it adds a segment to the underlying Lucene index.
Replication then syncs that segment to your slaves. To do master/master
replication, you would have to pull changes from each master, then merge
those changed segments into a single updated index. This is more complex
than what is happening in the current Solr replication (which is not
much more than an rsync of the index files).

Note, if you commit your changes to two masters, you cannot switch a
slave between them, as it is unlikely that the two masters will have
matching index files. If you did so, you would probably trigger a pull
of the entire index across the network, which (while it would likely
work) would not be the most efficient action.

What you can do is think cleverly about how you organise your
master/slave setup. E.g. have a slave that doesn't get queried, but
exists to take over the role of the master in case it fails. The index
on a slave is the same as that in a master, and can immediately take on
the role of the master (receiving commits), and upon failure of your
master, you could point your other slaves at this new master, and things
should just carry on as before.

Also, if you have a lot of slaves (such that they are placing too big a
load on your master), insert intermediate hosts that are both slaves off
the master, and masters to your query slaves. That way, you could have,
say, two boxes slaving off the master, then 20 or 30 slaving off them.

> And does Solr support replication between masters? Otherwise, I'll
> have to post the updates to all masters to keep the indexes of masters
> in sync. Does SolrCloud address this case? (Please note it is too
> early for me to read about SolrCloud as I'm still learning Solr)

I don't believe SolrCloud is aiming to support master/master
replication.

HTH

Upayavira




Re: DIH sub-entity not indexing

2010-10-04 Thread Allistair Crossley
Hey,

Yes, that tool doesn't work too well for me. I can load it up and get the forms 
on the left, but when I run a debug the right-hand side tells me that the page 
is not found. I *think* this is because I use a custom query string parameter 
in my DIH XML for delta querying; this parameter being missing is what fails the 
tool, and the tool doesn't support adding custom query string params.

Cheers, Allistair

On Oct 4, 2010, at 9:20 AM, Ephraim Ofir wrote:

> The closest you can get to debugging (without actually debugging...) is
> to look at the logs and use
> http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mo
> de
> 
> Ephraim Ofir
> 
> 
> -Original Message-
> From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
> Sent: Monday, October 04, 2010 3:09 PM
> To: solr-user@lucene.apache.org
> Subject: Re: DIH sub-entity not indexing
> 
> Thanks Ephraim. I tried your suggestion with the ID but capitalising it
> did not work. 
> 
> Indeed, I have a column that already works using a lower-case id. I wish
> I could debug it somehow - see the SQL? Something particular about this
> config it is not liking.
> 
> I read the post you linked to. This is more a performance-related thing
> for him. I would be happy just to see low performance and my contacts
> populated right now!! :D
> 
> Thanks again
> 
> On Oct 4, 2010, at 9:00 AM, Ephraim Ofir wrote:
> 
>> Make sure you're not running into a case sensitivity problem, some
> stuff
>> in DIH is case sensitive (and some stuff gets capitalized by the
> jdbc).
>> Try using listing.ID instead of listing.id.
>> On a side note, if you're using mysql, you might want to look at the
>> CONCAT_WS function.
>> You might also want to look into a different approach than
> sub-entities
>> -
>> 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3
>> 
> c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com
>> %3E
>> 
>> Ephraim Ofir
>> 
>> -Original Message-
>> From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
>> Sent: Monday, October 04, 2010 2:49 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: DIH sub-entity not indexing
>> 
>> I have tried a more elaborate join also following the features example
>> of the DIH example but same result - SQL works fine directly but Solr
> is
>> not indexing the array of full_names per Listing, e.g.
>> 
>> 
>> 
>>  >   query="select * from listing_contacts where
>> listing_id = '${listing.id}'">
>>   >  query="select concat(first_name,
>> concat(' ', last_name)) as full_name from contacts where id =
>> '${listing_contact.contact_id}'">
>>  
>>  
>>   
>> 
>> 
>> 
>> Am I missing the obvious?
>> 
>> On Oct 4, 2010, at 8:22 AM, Allistair Crossley wrote:
>> 
>>> Hello list,
>>> 
>>> I've been successful with DIH to a large extent but a seemingly
> simple
>> extra column I need is posing problems. In a nutshell I have 2
> entities
>> let's say - Listing habtm Contact. I have copied the relevant parts of
>> the configs below.
>>> 
>>> I have run my SQL for the sub-entity Contact and this is produces
>> correct results. No errors are given by Solr on running the import.
> Yet
>> no records are being set with the contacts array.
>>> 
>>> I have taken out my sub-entity config and replaced it with a simple
>> template value just to check and values then come through OK.
>>> 
>>> So it certainly seems limited to my query or query config somehow. I
>> followed roughly the example of the DIH bundled example.
>>> 
>>> DIH.xml
>>> ===
>>> 
>>> 
>>> ...
>>> >> query="select concat(c.first_name, concat(' ', c.last_name)) as
>> full_name from contacts c inner join listing_contacts lc on c.id =
>> lc.contact_id where lc.listing_id = '${listing.id}'">
>>> 
>>> 
>>> 
>>> SCHEMA.XML
>>> 
>>> > multiValued="true" required="false" />
>>> 
>>> 
>>> Any tips appreciated.
>> 
> 



RE: DIH sub-entity not indexing

2010-10-04 Thread Ephraim Ofir
Make sure you're not running into a case sensitivity problem, some stuff
in DIH is case sensitive (and some stuff gets capitalized by the jdbc).
Try using listing.ID instead of listing.id.
On a side note, if you're using mysql, you might want to look at the
CONCAT_WS function.
You might also want to look into a different approach than sub-entities
-
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com%3E

Ephraim Ofir

-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
Sent: Monday, October 04, 2010 2:49 PM
To: solr-user@lucene.apache.org
Subject: Re: DIH sub-entity not indexing

I have tried a more elaborate join also following the features example
of the DIH example but same result - SQL works fine directly but Solr is
not indexing the array of full_names per Listing, e.g.











Am I missing the obvious?

On Oct 4, 2010, at 8:22 AM, Allistair Crossley wrote:

> Hello list,
> 
> I've been successful with DIH to a large extent but a seemingly simple
extra column I need is posing problems. In a nutshell I have 2 entities
let's say - Listing habtm Contact. I have copied the relevant parts of
the configs below.
> 
> I have run my SQL for the sub-entity Contact and this is produces
correct results. No errors are given by Solr on running the import. Yet
no records are being set with the contacts array.
> 
> I have taken out my sub-entity config and replaced it with a simple
template value just to check and values then come through OK.
> 
> So it certainly seems limited to my query or query config somehow. I
followed roughly the example of the DIH bundled example.
> 
> DIH.xml
> ===
> 
> 
>  ...
>   query="select concat(c.first_name, concat(' ', c.last_name)) as
full_name from contacts c inner join listing_contacts lc on c.id =
lc.contact_id where lc.listing_id = '${listing.id}'">
> 
> 
> 
> SCHEMA.XML
> 
> 
> 
> 
> Any tips appreciated.



multi level faceting

2010-10-04 Thread Nguyen, Vincent (CDC/OD/OADS) (CTR)
Hi,

 

I was wondering if there's a way to display facet options based on
previous facet values.  For example, I've seen many shopping sites where
a user can facet by "Mens" or "Womens" apparel, then be shown "sizes" to
facet by (for Men or Women only - whichever they chose).  

 

Is this something that would have to be handled at the application
level?

 

Vincent Vu Nguyen



 



Re: DIH sub-entity not indexing

2010-10-04 Thread Stefan Matheis
Allistair,

Indeed, I have a column that already works using a lower-case id. I wish
> I could debug it somehow - see the SQL? Something particular about this
> config it is not liking.
>

You may want to try the MySQL query log, to check which queries are
actually performed:
http://dev.mysql.com/doc/refman/5.1/en/query-log.html
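
(Concretely, on MySQL 5.1 the general query log can be switched on at runtime - option names as per the page above, the file path is only an example:

  SET GLOBAL general_log_file = '/var/log/mysql/query.log';
  SET GLOBAL general_log = 'ON';

and every statement the DIH connection sends then shows up in that file.)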


Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Andy
 
> I got your point. You want to retrieve "electric吉他"
> with the query 吉他. That's why you don't want EdgeNGram.
> If this is the only reason for NGram, I think you can
> transform "electric吉他" into two tokens "electric"
> "吉他" in TokenFilter(s) and apply EdgeNGram approach.
> 

What TokenFilters would split "electric吉他" into "electric" & "吉他"?





Re: DIH sub-entity not indexing

2010-10-04 Thread Allistair Crossley
Thanks Ephraim. I tried your suggestion with the ID but capitalising it did not 
work. 

Indeed, I have a column that already works using a lower-case id. I wish I 
could debug it somehow - see the SQL? Something particular about this config it 
is not liking.

I read the post you linked to. This is more a performance-related thing for 
him. I would be happy just to see low performance and my contacts populated 
right now!! :D

Thanks again

On Oct 4, 2010, at 9:00 AM, Ephraim Ofir wrote:

> Make sure you're not running into a case sensitivity problem, some stuff
> in DIH is case sensitive (and some stuff gets capitalized by the jdbc).
> Try using listing.ID instead of listing.id.
> On a side note, if you're using mysql, you might want to look at the
> CONCAT_WS function.
> You might also want to look into a different approach than sub-entities
> -
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3
> c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com
> %3E
> 
> Ephraim Ofir
> 
> -Original Message-
> From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
> Sent: Monday, October 04, 2010 2:49 PM
> To: solr-user@lucene.apache.org
> Subject: Re: DIH sub-entity not indexing
> 
> I have tried a more elaborate join also following the features example
> of the DIH example but same result - SQL works fine directly but Solr is
> not indexing the array of full_names per Listing, e.g.
> 
> 
> 
>   query="select * from listing_contacts where
> listing_id = '${listing.id}'">
>   query="select concat(first_name,
> concat(' ', last_name)) as full_name from contacts where id =
> '${listing_contact.contact_id}'">
>   
>   
>
> 
> 
> 
> Am I missing the obvious?
> 
> On Oct 4, 2010, at 8:22 AM, Allistair Crossley wrote:
> 
>> Hello list,
>> 
>> I've been successful with DIH to a large extent but a seemingly simple
> extra column I need is posing problems. In a nutshell I have 2 entities
> let's say - Listing habtm Contact. I have copied the relevant parts of
> the configs below.
>> 
>> I have run my SQL for the sub-entity Contact and this is produces
> correct results. No errors are given by Solr on running the import. Yet
> no records are being set with the contacts array.
>> 
>> I have taken out my sub-entity config and replaced it with a simple
> template value just to check and values then come through OK.
>> 
>> So it certainly seems limited to my query or query config somehow. I
> followed roughly the example of the DIH bundled example.
>> 
>> DIH.xml
>> ===
>> 
>> 
>> ...
>> > query="select concat(c.first_name, concat(' ', c.last_name)) as
> full_name from contacts c inner join listing_contacts lc on c.id =
> lc.contact_id where lc.listing_id = '${listing.id}'">
>> 
>> 
>> 
>> SCHEMA.XML
>> 
>>  multiValued="true" required="false" />
>> 
>> 
>> Any tips appreciated.
> 



DIH sub-entity not indexing

2010-10-04 Thread Allistair Crossley
Hello list,

I've been successful with DIH to a large extent but a seemingly simple extra 
column I need is posing problems. In a nutshell I have 2 entities let's say - 
Listing habtm Contact. I have copied the relevant parts of the configs below.

I have run the SQL for the sub-entity Contact and it produces correct 
results. No errors are given by Solr on running the import. Yet no records are 
being set with the contacts array.

I have taken out my sub-entity config and replaced it with a simple template 
value just to check and values then come through OK.

So it certainly seems limited to my query or query config somehow. I followed 
roughly the example of the DIH bundled example.

DIH.xml
===


...
<entity name="contact"
        query="select concat(c.first_name, concat(' ', c.last_name)) as full_name
               from contacts c inner join listing_contacts lc on c.id = lc.contact_id
               where lc.listing_id = '${listing.id}'">
</entity>

SCHEMA.XML

<field name="full_name" ... multiValued="true" required="false" />

Any tips appreciated.

Re: UpdateXmlMessage

2010-10-04 Thread Tod

On 10/1/2010 11:33 PM, Lance Norskog wrote:

Yes. stream.file and stream.url are independent of the request handler.
They do their magic at the very top level of the request.

However, there are no unit tests for these features, but they are widely
used.



Sorry Lance, are you agreeing that I can't or that I can?  If I can, I'm 
doing something wrong.  I'm specifying stream.url as its own field in 
the XML like:

<add>
 <doc>
  <field name="author">I am the author</field>
  <field name="title">I am the title</field>
  <field name="stream.url">http://www.test.com/myOfficeDoc.doc</field>
  .
  .
  .
 </doc>
</add>

The wiki docs were a little sparse on this one.

- Tod





Tod wrote:

I can do this using GET:

http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3Eoffice:Bridgewater%3C/query%3E%3C/delete%3E

http://localhost:8983/solr/update?stream.body=%3Ccommit/%3E

... but can I pass a stream.url parameter using an UpdateXmlMessage? I
looked at the schema and I think the answer is no but just wanted to
check.


TIA






Re: DIH sub-entity not indexing

2010-10-04 Thread Allistair Crossley
Very clever thinking indeed. Well, that's certainly revealed the problem ... 
${listing.id} is empty on my sub-entity query ... 

And this is because I prefix the indexed ID with a letter.

This appears to modify the internal value of ${listing.id} for subsequent uses.

Well, I can work around this now. Thanks!
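
For anyone hitting the same thing, one possible shape of the workaround (a sketch only - the entity/column names and the "L" prefix are made up for illustration, and it assumes the prefix is added with DIH's TemplateTransformer): build the prefixed Solr id under a separate column so the raw ${listing.id} stays untouched for the sub-entity:

<entity name="listing" transformer="TemplateTransformer"
        query="select id, ... from listings">
  <!-- prefixed value is written to a new column and mapped to the uniqueKey field -->
  <field column="solr_id" name="id" template="L${listing.id}"/>
  <!-- the sub-entity still sees the raw database id -->
  <entity name="listing_contact"
          query="select * from listing_contacts where listing_id = '${listing.id}'">
    ...
  </entity>
</entity>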

On Oct 4, 2010, at 9:35 AM, Stefan Matheis wrote:

> Allistair,
> 
> Indeed, I have a column that already works using a lower-case id. I wish
>> I could debug it somehow - see the SQL? Something particular about this
>> config it is not liking.
>> 
> 
> you may want to try the MySQL Query-Log, to check which Queries are
> performed?
> http://dev.mysql.com/doc/refman/5.1/en/query-log.html



Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Ahmet Arslan
> What TokenFilters would split "electric吉他" into
> "electric" & "吉他"?

Is it possible to write a regex to capture Chinese text? (Unicode range?)

If yes, you can use PatternReplaceFilter to transform electric吉他 into 
electric_吉他.



After that, WordDelimiterFilterFactory can produce two adjacent tokens.

But maybe using a custom filter would be easier.
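
A rough sketch of what that chain could look like (the regex, delimiter and attribute values here are illustrative assumptions, not from the original mail; the \u4E00-\u9FFF range only covers the common CJK ideographs):

<filter class="solr.PatternReplaceFilterFactory"
        pattern="([A-Za-z0-9]+)([\u4E00-\u9FFF]+)"
        replacement="$1_$2" replace="all"/>
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" splitOnCaseChange="0"/>

so "electric吉他" becomes "electric_吉他" and is then split into "electric" and "吉他" before any EdgeNGram filter runs.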





Re: solr-user

2010-10-04 Thread Erick Erickson
I suspect you're not actually including the path to those jars.
SolrException should be in your solrj jar file. You can test this
by executing "jar -tf apacheBLAHBLAH.jar" which will dump
all the class names in the jar file. I'm assuming that you're
really including the version for the * in the solrj jar file here

So I'd guess it's a classpath issue and you're not really including
what you think you are
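
(For example, something along these lines, with the jar names/versions adjusted to whatever is actually on disk - the list here is illustrative, not complete:

  java -cp .:apache-solr-solrj-1.4.1.jar:commons-httpclient-3.1.jar:slf4j-api-1.5.5.jar:... SolrjTest

On Windows the separator is ";" rather than ":".)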

HTH
Erick

On Fri, Oct 1, 2010 at 11:28 PM, ankita shinde wrote:

> -- Forwarded message --
> From: ankita shinde 
> Date: Sat, Oct 2, 2010 at 8:54 AM
> Subject: solr-user
> To: solr-user@lucene.apache.org
>
>
> hello,
>
> I am trying to use solrj for interfacing with solr. I am trying to run the
> SolrjTest example. I have included all the following  jar files-
>
>
>   - commons-codec-1.3.jar
>   - commons-fileupload-1.2.1.jar
>   - commons-httpclient-3.1.jar
>   - commons-io-1.4.jar
>   - geronimo-stax-api_1.0_spec-1.0.1.jar
>   - apache-solr-solrj-*.jar
>   - wstx-asl-3.2.7.jar
>   - slf4j-api-1.5.5.jar
>   - slf4j-simple-1.5.5.jar
>
>
>
>
>  But its giving me error as 'NoClassDefFoundError:
> org/apache/solr/client/solrj/SolrServerException'.
> Can anyone tell me where did i go wrong?
>


Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Ahmet Arslan
> Does anyone know how to deal with these 2 issues when using
> NGramFilterFactory for autocomplete?
> 
> 1) hyphens - if user types "ema" or "e-ma" I want to
> suggest "email"
> 
> 2) accents - if user types "herme"  want to suggest
> "Hermès"

Accents can be removed using MappingCharFilterFactory before the 
tokenizer (at both index and query time):

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>

I am not sure if this is the most elegant solution, but you can replace "-" with "" 
using MappingCharFilterFactory too. It satisfies what you describe in 1.

But generally NGramFilterFactory produces a lot of tokens. I mean the query "er" can 
return "hermes". Maybe EdgeNGramFilterFactory is more suitable for the 
auto-complete task. At least it guarantees that some word starts with that 
character sequence.
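
A minimal sketch of how those pieces could fit together for an auto-complete field (the field type name and the gram sizes are illustrative assumptions, and the mapping file would also need a "-" => "" line to cover point 1):

<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>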



 


More like this and terms positions

2010-10-04 Thread Xavier Schepler

Hi,

does the more like this search uses terms positions information in the 
score formula ?


Re: solr-user

2010-10-04 Thread Allistair Crossley
I updated the SolrJ JAR requirements to be clearer on the wiki page given how 
many of these SolrJ emails I saw coming through since joining the list. I just 
created a test java class and imported the removed JARs until I found out the 
minimal set required.

On Oct 4, 2010, at 8:27 AM, Erick Erickson wrote:

> I suspect you're not actually including the path to those jars.
> SolrException should be in your solrj jar file. You can test this
> by executing "jar -tf apacheBLAHBLAH.jar" which will dump
> all the class names in the jar file. I'm assuming that you're
> really including the version for the * in the solrj jar file here
> 
> So I'd guess it's a classpath issue and you're not really including
> what you think you are
> 
> HTH
> Erick
> 
> On Fri, Oct 1, 2010 at 11:28 PM, ankita shinde 
> wrote:
> 
>> -- Forwarded message --
>> From: ankita shinde 
>> Date: Sat, Oct 2, 2010 at 8:54 AM
>> Subject: solr-user
>> To: solr-user@lucene.apache.org
>> 
>> 
>> hello,
>> 
>> I am trying to use solrj for interfacing with solr. I am trying to run the
>> SolrjTest example. I have included all the following  jar files-
>> 
>> 
>>  - commons-codec-1.3.jar
>>  - commons-fileupload-1.2.1.jar
>>  - commons-httpclient-3.1.jar
>>  - commons-io-1.4.jar
>>  - geronimo-stax-api_1.0_spec-1.0.1.jar
>>  - apache-solr-solrj-*.jar
>>  - wstx-asl-3.2.7.jar
>>  - slf4j-api-1.5.5.jar
>>  - slf4j-simple-1.5.5.jar
>> 
>> 
>> 
>> 
>> But its giving me error as 'NoClassDefFoundError:
>> org/apache/solr/client/solrj/SolrServerException'.
>> Can anyone tell me where did i go wrong?
>> 



Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Ahmet Arslan
> I agree with the issues with NGramFilterFactory you pointed
> out and I really want to avoid using it. But the problem is
> that I have Chinese tags like "电吉他" and multi-lingual
> tags like "electric吉他".

I got your point. You want to retrieve "electric吉他" with the query 吉他. That's 
why you don't want EdgeNGram.
If this is the only reason for NGram, I think you can transform "electric吉他" 
into two tokens "electric" "吉他" in TokenFilter(s) and apply EdgeNGram approach.






Re: DIH sub-entity not indexing

2010-10-04 Thread Allistair Crossley
I have tried a more elaborate join also following the features example of the 
DIH example but same result - SQL works fine directly but Solr is not indexing 
the array of full_names per Listing, e.g.











Am I missing the obvious?

On Oct 4, 2010, at 8:22 AM, Allistair Crossley wrote:

> Hello list,
> 
> I've been successful with DIH to a large extent but a seemingly simple extra 
> column I need is posing problems. In a nutshell I have 2 entities let's say - 
> Listing habtm Contact. I have copied the relevant parts of the configs below.
> 
> I have run my SQL for the sub-entity Contact and this is produces correct 
> results. No errors are given by Solr on running the import. Yet no records 
> are being set with the contacts array.
> 
> I have taken out my sub-entity config and replaced it with a simple template 
> value just to check and values then come through OK.
> 
> So it certainly seems limited to my query or query config somehow. I followed 
> roughly the example of the DIH bundled example.
> 
> DIH.xml
> ===
> 
> 
>  ...
>   query="select concat(c.first_name, concat(' ', c.last_name)) as full_name 
> from contacts c inner join listing_contacts lc on c.id = lc.contact_id where 
> lc.listing_id = '${listing.id}'">
> 
> 
> 
> SCHEMA.XML
> 
>  multiValued="true" required="false" />
> 
> 
> Any tips appreciated.



Re: Autosuggest with inner phrases

2010-10-04 Thread Otis Gospodnetic
Or, this: http://www.sematext.com/products/autocomplete/index.html , 
which happens to use the same "bass" examples as the original poster. :)

You can see this Autosuggest in action on http://search-lucene.com/ .

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch



- Original Message 
> From: Jason Rutherglen 
> To: solr-user@lucene.apache.org
> Sent: Sat, October 2, 2010 3:40:52 PM
> Subject: Re: Autosuggest with inner phrases
> 
> This's what yer lookin' for:
> 
>http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>/
> 
> On  Sat, Oct 2, 2010 at 3:14 AM, sivaprasad   
>wrote:
> >
> > Hi ,
> > I implemented the auto suggest using the terms component. But the suggestions are
> > coming from the start of the word. But I want inner phrases also. For
> > example, if I type "bass"  Auto-Complete should offer suggestions that
> > include "bass fishing"  or  "bass guitar", and even "sea bass" (note how
> > "bass" is not necessarily  the first word).
> >
> > How can i achieve this using solr's terms  component.
> >
> > Regards,
> > Siva
> > --
> > View this  message in context: 
>http://lucene.472066.n3.nabble.com/Autosuggest-with-inner-phrases-tp1619326p1619326.html
>
> >  Sent from the Solr - User mailing list archive at Nabble.com.
> >
> 


Re: Highlighting match term in bold rather than italic

2010-10-04 Thread Otis Gospodnetic
Hi,

It's a matter of the config.  Have a look at the highlighter section of 
solrconfig.xml.
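
The quickest route is usually the request parameters rather than solrconfig, e.g. something like

  &hl=true&hl.fl=yourfield&hl.simple.pre=<b>&hl.simple.post=</b>

(the field name is a placeholder), or set the same hl.simple.pre/hl.simple.post values as defaults in the highlighting section of solrconfig.xml. The default wrapper is <em>...</em>, which is why you are seeing italics.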

Otis 

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: "efr...@gmail.com" 
> To: solr-user@lucene.apache.org
> Sent: Thu, September 30, 2010 5:54:19 PM
> Subject: Highlighting match term in bold rather than italic
> 
> Hi all -
> 
> Does anyone know how to produce solr results where the match  term is
> highlighted in bold rather than italic?
> 
> thanks in  advance,
> 
> Brad
> 


Re: More like this and terms positions

2010-10-04 Thread Robert Muir
On Mon, Oct 4, 2010 at 10:16 AM, Xavier Schepler <
xavier.schep...@sciences-po.fr> wrote:

> Hi,
>
> does the more like this search uses terms positions information in the
> score formula ?
>

no, it would be nice if it did use them though (based upon query terms),
seems like it would yield improvements.

http://sifaka.cs.uiuc.edu/~ylv2/pub/sigir10-prm.pdf

-- 
Robert Muir
rcm...@gmail.com


Re: Multiple masters and replication between masters?

2010-10-04 Thread Otis Gospodnetic
Hi,

Would this help you:

http://wiki.apache.org/solr/SolrReplication#Setting_up_a_Repeater
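
For reference, a repeater is just a core whose ReplicationHandler is configured as both master and slave, roughly along these lines (master host, poll interval and confFiles are placeholders):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>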


Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Arunkumar Ayyavu 
> To: solr-user@lucene.apache.org
> Sent: Sun, October 3, 2010 2:55:22 PM
> Subject: Multiple masters and replication between masters?
> 
> I'm looking at setting up multiple masters for redundancy (for  index
> updates). I found the thread in this link
>(http://www.lucidimagination.com/search/document/68ac303ce8425506/multiple_masters_solr_replication_1_4)
>)
> discussed  this subject more than a year back. Does Solr support such
> configuration  today?
> 
> And does Solr support replication between masters? Otherwise,  I'll
> have to post the updates to all masters to keep the indexes of  masters
> in sync. Does SolrCloud address this case? (Please note it is  too
> early for me to read about SolrCloud as I'm still learning  Solr)
> 
> Thanks in advance.
> 
> --
> Arun
> 


Re: More like this and terms positions

2010-10-04 Thread Xavier Schepler

On 04/10/2010 16:40, Robert Muir wrote:

On Mon, Oct 4, 2010 at 10:16 AM, Xavier Schepler<
xavier.schep...@sciences-po.fr>  wrote:

   

Hi,

does the more like this search uses terms positions information in the
score formula ?

 

no, it would be nice if it did use them though (based upon query terms),
seems like it would yield improvements.

http://sifaka.cs.uiuc.edu/~ylv2/pub/sigir10-prm.pdf

   

Maybe in a future Solr version?


RE: solrj

2010-10-04 Thread Xin Li
I asked the exact same question the day before. If you or anyone else has
a pointer to the solution, please share it on the mailing list. For now, I am
using a Perl script instead to query the Solr server.

Thanks,
Xin

-Original Message-
From: ankita shinde [mailto:ankitashinde...@gmail.com] 
Sent: Saturday, October 02, 2010 2:30 PM
To: solr-user@lucene.apache.org
Subject: solrj

hello,

I am trying to use solrj for interfacing with solr. I am trying to run
the
SolrjTest example. I have included all the following  jar files-


   - commons-codec-1.3.jar
   - commons-fileupload-1.2.1.jar
   - commons-httpclient-3.1.jar
   - commons-io-1.4.jar
   - geronimo-stax-api_1.0_spec-1.0.1.jar
   - apache-solr-solrj-*.jar
   - wstx-asl-3.2.7.jar
   - slf4j-api-1.5.5.jar
   - slf4j-simple-1.5.5.jar


*My SolrjTest file is as follows:*

import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrDocument;
import java.util.Map;
import java.util.Iterator;
import java.util.List;
import java.util.ArrayList;
import java.util.HashMap;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer; // missing in the original - needed for the server used below
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.FacetField;


class SolrjTest
{
public void query(String q)
{
CommonsHttpSolrServer server = null;

try
{
server = new
CommonsHttpSolrServer("http://localhost:8983/solr/
");
}
catch(Exception e)
{
e.printStackTrace();
}

SolrQuery query = new SolrQuery();
query.setQuery(q);
query.setQueryType("dismax");
query.setFacet(true);
query.addFacetField("lastname");
query.addFacetField("locality4");
query.setFacetMinCount(2);
query.setIncludeScore(true);

try
{
QueryResponse qr = server.query(query);

SolrDocumentList sdl = qr.getResults();

System.out.println("Found: " + sdl.getNumFound());
System.out.println("Start: " + sdl.getStart());
System.out.println("Max Score: " + sdl.getMaxScore());
System.out.println("");

ArrayList<HashMap<String, Object>> hitsOnPage = new
ArrayList<HashMap<String, Object>>();

for(SolrDocument d : sdl)
{
HashMap<String, Object> values = new HashMap<String, Object>();

for(Iterator<Map.Entry<String, Object>> i =
d.iterator();
i.hasNext(); )
{
Map.Entry<String, Object> e2 = i.next();

values.put(e2.getKey(), e2.getValue());
}

hitsOnPage.add(values);
System.out.println(values.get("displayname") + " (" +
values.get("displayphone") + ")");
}

List<FacetField> facets = qr.getFacetFields();

for(FacetField facet : facets)
{
List<FacetField.Count> facetEntries = facet.getValues();

for(FacetField.Count fcount : facetEntries)
{
System.out.println(fcount.getName() + ": " +
fcount.getCount());
}
}
}
catch (SolrServerException e)
{
e.printStackTrace();
}

}

public static void main(String[] args)
{
SolrjTest solrj = new SolrjTest();
solrj.query(args[0]);
}
}





 But it's giving me an error: 'NoClassDefFoundError:
org/apache/solr/client/solrj/SolrServerException'.
Can anyone tell me where I went wrong?



Re: solrj

2010-10-04 Thread Allistair Crossley
i rewrote the top jar section at

http://wiki.apache.org/solr/Solrj

and the following code then runs fine.

import java.net.MalformedURLException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

class TestSolrQuery {

public static void main(String[] args) {

String url = "http://localhost:8983/solr";
SolrServer server = null;

try { 
server = new CommonsHttpSolrServer(url);
} catch (MalformedURLException e) {
System.out.println(e);
System.exit(1);
}

SolrQuery query = new SolrQuery();
query.setQuery("*:*");

QueryResponse rsp = null;
try { 
rsp = server.query(query);
} catch(SolrServerException e) {
System.out.println(e);
System.exit(1);
}

SolrDocumentList docs = rsp.getResults();
for (SolrDocument doc : docs) {
System.out.println(doc.toString());
}
}
}


On Oct 4, 2010, at 11:26 AM, Xin Li wrote:

> I asked the exact question the day before. If you or anyone else has
> pointer to the solution, please share on the mail list. For now, I am
> using Perl script instead to query Solr server.
> 
> Thanks,
> Xin
> 
> -Original Message-
> From: ankita shinde [mailto:ankitashinde...@gmail.com] 
> Sent: Saturday, October 02, 2010 2:30 PM
> To: solr-user@lucene.apache.org
> Subject: solrj
> 
> hello,
> 
> I am trying to use solrj for interfacing with solr. I am trying to run
> the
> SolrjTest example. I have included all the following  jar files-
> 
> 
>   - commons-codec-1.3.jar
>   - commons-fileupload-1.2.1.jar
>   - commons-httpclient-3.1.jar
>   - commons-io-1.4.jar
>   - geronimo-stax-api_1.0_spec-1.0.1.jar
>   - apache-solr-solrj-*.jar
>   - wstx-asl-3.2.7.jar
>   - slf4j-api-1.5.5.jar
>   - slf4j-simple-1.5.5.jar
> 
> 
> *My SolrjTest file is as follows:*
> 
> import org.apache.solr.common.SolrDocumentList;
> import org.apache.solr.common.SolrDocument;
> import java.util.Map;
> import org.apache.solr.common.SolrDocumentList;
> import org.apache.solr.common.SolrDocument;
> import java.util.Map;
> import java.util.Iterator;
> import java.util.List;
> import java.util.ArrayList;
> import java.util.HashMap;
> 
> import org.apache.solr.client.solrj.SolrServerException;
> import org.apache.solr.client.solrj.SolrQuery;
> import org.apache.solr.client.solrj.response.QueryResponse;
> import org.apache.solr.client.solrj.response.FacetField;
> 
> 
> class SolrjTest
> {
>public void query(String q)
>{
>CommonsHttpSolrServer server = null;
> 
>try
>{
>server = new
> CommonsHttpSolrServer("http://localhost:8983/solr/
> ");
>}
>catch(Exception e)
>{
>e.printStackTrace();
>}
> 
>SolrQuery query = new SolrQuery();
>query.setQuery(q);
>query.setQueryType("dismax");
>query.setFacet(true);
>query.addFacetField("lastname");
>query.addFacetField("locality4");
>query.setFacetMinCount(2);
>query.setIncludeScore(true);
> 
>try
>{
>QueryResponse qr = server.query(query);
> 
>SolrDocumentList sdl = qr.getResults();
> 
>System.out.println("Found: " + sdl.getNumFound());
>System.out.println("Start: " + sdl.getStart());
>System.out.println("Max Score: " + sdl.getMaxScore());
>System.out.println("");
> 
>ArrayList> hitsOnPage = new
> ArrayList>();
> 
>for(SolrDocument d : sdl)
>{
>HashMap values = new HashMap Object>();
> 
>for(Iterator> i =
> d.iterator();
> i.hasNext(); )
>{
>Map.Entry e2 = i.next();
> 
>values.put(e2.getKey(), e2.getValue());
>}
> 
>hitsOnPage.add(values);
>System.out.println(values.get("displayname") + " (" +
> values.get("displayphone") + ")");
>}
> 
>List facets = qr.getFacetFields();
> 
>for(FacetField facet : facets)
>{
>List facetEntries = facet.getValues();
> 
>for(FacetField.Count fcount : facetEntries)
>{
>System.out.println(fcount.getName() + ": " +
> fcount.getCount());
>}
>}
>}
> 

RE: multi level faceting

2010-10-04 Thread Nguyen, Vincent (CDC/OD/OADS) (CTR)
Ok.  Thanks for the quick response.

Vincent Vu Nguyen
Division of Science Quality and Translation
Office of the Associate Director for Science
Centers for Disease Control and Prevention (CDC)
404-498-6154
Century Bldg 2400
Atlanta, GA 30329 


-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
Sent: Monday, October 04, 2010 9:40 AM
To: solr-user@lucene.apache.org
Subject: Re: multi level faceting

I think that is just sending 2 fq facet queries through. In Solr PHP I
would do that with, e.g.

$params['facet'] = true;
$params['facet.fields'] = array('Size');
$params['fq'] => array('sex' => array('Men', 'Women'));

but yes i think you'd have to send through what the current facet query
is and add it to your next drill-down

On Oct 4, 2010, at 9:36 AM, Nguyen, Vincent (CDC/OD/OADS) (CTR) wrote:

> Hi,
> 
> 
> 
> I was wondering if there's a way to display facet options based on
> previous facet values.  For example, I've seen many shopping sites
where
> a user can facet by "Mens" or "Womens" apparel, then be shown "sizes"
to
> facet by (for Men or Women only - whichever they chose).  
> 
> 
> 
> Is this something that would have to be handled at the application
> level?
> 
> 
> 
> Vincent Vu Nguyen
> 
> 
> 
> 
> 




RE: solrj

2010-10-04 Thread Xin Li
Thanks, Allistair. I will give it a try later today. 


-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
Sent: Monday, October 04, 2010 11:31 AM
To: solr-user@lucene.apache.org
Subject: Re: solrj

i rewrote the top jar section at

http://wiki.apache.org/solr/Solrj

and the following code then runs fine.

import java.net.MalformedURLException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

class TestSolrQuery {

public static void main(String[] args) {

String url = "http://localhost:8983/solr";
SolrServer server = null;

try { 
server = new CommonsHttpSolrServer(url);
} catch (MalformedURLException e) {
System.out.println(e);
System.exit(1);
}

SolrQuery query = new SolrQuery();
query.setQuery("*:*");

QueryResponse rsp = null;
try { 
rsp = server.query(query);
} catch(SolrServerException e) {
System.out.println(e);
System.exit(1);
}

SolrDocumentList docs = rsp.getResults();
for (SolrDocument doc : docs) {
System.out.println(doc.toString());
}
}
}


On Oct 4, 2010, at 11:26 AM, Xin Li wrote:

> I asked the exact question the day before. If you or anyone else has
> pointer to the solution, please share on the mail list. For now, I am
> using Perl script instead to query Solr server.
> 
> Thanks,
> Xin
> 
> -Original Message-
> From: ankita shinde [mailto:ankitashinde...@gmail.com] 
> Sent: Saturday, October 02, 2010 2:30 PM
> To: solr-user@lucene.apache.org
> Subject: solrj
> 
> hello,
> 
> I am trying to use solrj for interfacing with solr. I am trying to run
> the
> SolrjTest example. I have included all the following  jar files-
> 
> 
>   - commons-codec-1.3.jar
>   - commons-fileupload-1.2.1.jar
>   - commons-httpclient-3.1.jar
>   - commons-io-1.4.jar
>   - geronimo-stax-api_1.0_spec-1.0.1.jar
>   - apache-solr-solrj-*.jar
>   - wstx-asl-3.2.7.jar
>   - slf4j-api-1.5.5.jar
>   - slf4j-simple-1.5.5.jar
> 
> 
> *My SolrjTest file is as follows:*
> 
> import org.apache.solr.common.SolrDocumentList;
> import org.apache.solr.common.SolrDocument;
> import java.util.Map;
> import org.apache.solr.common.SolrDocumentList;
> import org.apache.solr.common.SolrDocument;
> import java.util.Map;
> import java.util.Iterator;
> import java.util.List;
> import java.util.ArrayList;
> import java.util.HashMap;
> 
> import org.apache.solr.client.solrj.SolrServerException;
> import org.apache.solr.client.solrj.SolrQuery;
> import org.apache.solr.client.solrj.response.QueryResponse;
> import org.apache.solr.client.solrj.response.FacetField;
> 
> 
> class SolrjTest
> {
>public void query(String q)
>{
>CommonsHttpSolrServer server = null;
> 
>try
>{
>server = new
> CommonsHttpSolrServer("http://localhost:8983/solr/
> ");
>}
>catch(Exception e)
>{
>e.printStackTrace();
>}
> 
>SolrQuery query = new SolrQuery();
>query.setQuery(q);
>query.setQueryType("dismax");
>query.setFacet(true);
>query.addFacetField("lastname");
>query.addFacetField("locality4");
>query.setFacetMinCount(2);
>query.setIncludeScore(true);
> 
>try
>{
>QueryResponse qr = server.query(query);
> 
>SolrDocumentList sdl = qr.getResults();
> 
>System.out.println("Found: " + sdl.getNumFound());
>System.out.println("Start: " + sdl.getStart());
>System.out.println("Max Score: " + sdl.getMaxScore());
>System.out.println("");
> 
>ArrayList> hitsOnPage = new
> ArrayList>();
> 
>for(SolrDocument d : sdl)
>{
>HashMap values = new HashMap Object>();
> 
>for(Iterator> i =
> d.iterator();
> i.hasNext(); )
>{
>Map.Entry e2 = i.next();
> 
>values.put(e2.getKey(), e2.getValue());
>}
> 
>hitsOnPage.add(values);
>System.out.println(values.get("displayname") + " (" +
> values.get("displayphone") + ")");
>}
> 
>List facets = qr.getFacetFields();
> 
>for(FacetField facet : facets)
>{
>List facetEntries =
facet.get

RE: multi level faceting

2010-10-04 Thread Jason Brown
Yes, by adding fq back into the main query you will get results increasingly 
filtered each time.

You may run into an issue if you are displaying facet counts, as the facet part 
of the query will also obey the increasingly filtered fq, and so not display 
counts for other categories anymore from the chosen facet (depends if you need 
to display counts from a facet once the first value from the facet has been 
chosen if you get my drift). Local params are a way to deal with this by not 
subjecting the facet count to the same fq restriction (but allowing the search 
results to obey it).
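
(With the tag/exclude local params, that looks roughly like the following - the field and tag names are only examples:

  fq={!tag=sexFilter}sex:Men&facet.field={!ex=sexFilter}sex&facet.field=Size

so the Size counts obey the filter while the sex facet still shows counts for both values.)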



-Original Message-
From: Nguyen, Vincent (CDC/OD/OADS) (CTR) [mailto:v...@cdc.gov]
Sent: Mon 04/10/2010 16:34
To: solr-user@lucene.apache.org
Subject: RE: multi level faceting
 
Ok.  Thanks for the quick response.

Vincent Vu Nguyen
Division of Science Quality and Translation
Office of the Associate Director for Science
Centers for Disease Control and Prevention (CDC)
404-498-6154
Century Bldg 2400
Atlanta, GA 30329 


-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk] 
Sent: Monday, October 04, 2010 9:40 AM
To: solr-user@lucene.apache.org
Subject: Re: multi level faceting

I think that is just sending 2 fq facet queries through. In Solr PHP I
would do that with, e.g.

$params['facet'] = true;
$params['facet.fields'] = array('Size');
$params['fq'] => array('sex' => array('Men', 'Women'));

but yes i think you'd have to send through what the current facet query
is and add it to your next drill-down

On Oct 4, 2010, at 9:36 AM, Nguyen, Vincent (CDC/OD/OADS) (CTR) wrote:

> Hi,
> 
> 
> 
> I was wondering if there's a way to display facet options based on
> previous facet values.  For example, I've seen many shopping sites
where
> a user can facet by "Mens" or "Womens" apparel, then be shown "sizes"
to
> facet by (for Men or Women only - whichever they chose).  
> 
> 
> 
> Is this something that would have to be handled at the application
> level?
> 
> 
> 
> Vincent Vu Nguyen
> 
> 
> 
> 
> 






Dismax Filtering Hyphens? Why is this not working? How do I debug Dismax?

2010-10-04 Thread Scott Gonyea
Wow, this is probably the most annoying Solr issue I've *ever* dealt
with. First question: How do I debug Dismax, and its query handling?

Issue: When I query against this StrField, I am attempting to do an
*exact* match...  Albeit one that is case-insensitive :).  So, 90%
exact.  It works in a majority of cases.  Indeed, I am telling Solr
that this field is my uniqueKey field and it enforces uniqueness
perfectly.  The issue comes about when I try to query a document,
based on a key in this field, and the key I'm using has hyphens
(dashes) in it.  Then I get zero results.  Very frustrating.

The keys will always be a URL.  IE,
"http://helloworld.abc/I-ruin-your-queries-aghghaahahaagcry";

Here's my configuration info...  schema.xml (the URL exists twice;
once in 'idstr' format, for uniqueness, and once in the 'url' form
below. I am querying against the 'idstr' field):

<fieldType name="idstr" class="solr.StrField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="(.*)" group="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

...

<uniqueKey>id</uniqueKey>
<defaultSearchField>content</defaultSearchField>

Yes, the PatternTokenizerFactory is inefficient for doing what I
wanted above. It was a quick hack, while I sought something to do
exactly what I'm doing above.  IE, exact / WHOLE string... but lower
case.

Here's my solrconfig.xml:



  
dismax
explicit
0.01
 content^1.5 anchor^0.3 title^1.2
mcode^1.0 site_id^1.0 priority^1.0
 * 
true
*:*
content title
0
title
regex3
  



And, finally, when I run that sample URL through the query analyzer...
 here's the output (copied from the HTML)... I appreciate any/all help
anyone can provide.  Seriously.  I'll love you forever :(  :


Index Analyzer
  org.apache.solr.analysis.PatternTokenizerFactory
    term position 1, type "word", start/end 0,58:
    http://helloworld.abc/I-ruin-your-queries-aghghaahahaagcry
  org.apache.solr.analysis.LowerCaseFilterFactory
    term position 1, type "word", start/end 0,58:
    http://helloworld.abc/i-ruin-your-queries-aghghaahahaagcry

Query Analyzer
  org.apache.solr.analysis.PatternTokenizerFactory
    term position 1, type "word", start/end 0,58:
    http://helloworld.abc/I-ruin-your-queries-aghghaahahaagcry
  org.apache.solr.analysis.LowerCaseFilterFactory
    term position 1, type "word", start/end 0,58:
    http://helloworld.abc/i-ruin-your-queries-aghghaahahaagcry




having problem about Solr Date Field.

2010-10-04 Thread Kouta Osabe
Hi,All

I have a problem about Solr Date Field.

The problem is like below.

SolrBean foo = new SolrBean();
// The type of the pubDate property is "java.util.Date" and rs means
// "java.sql.ResultSet", so rs.getDate("pub_date") returns a java.sql.Date
// object.
foo.pubDate = rs.getDate("pub_date");

The value of the pub_date column comes from MySQL, and the actual value is
"2010-10-05 00:00:00".

I register the "foo" bean to Solr through SolrJ like "new
CommonsHttpSolrServer().addBean(foo)".

I expected "2010-10-05 00:00:00" to be displayed by the Solr Admin, but
"2010-10-04T15:00:00Z" is displayed instead.

is this timezone problem?(I live in Tokyo Japan).

Any Hint, please!


Re: Dismax Filtering Hyphens? Why is this not working? How do I debug Dismax?

2010-10-04 Thread Ahmet Arslan
> 
> <fieldType name="idstr" class="solr.StrField">
>   <analyzer>
>     <tokenizer class="solr.PatternTokenizerFactory" pattern="(.*)" group="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>

This definition is invalid. You cannot use charfilter/tokenizer/tokenfilter 
with solr.StrField.

But it is interesting that (I just tested) analysis.jsp (1.4.1) displays as if 
it's working. But if you look at /schema.jsp you will see that the real indexed 
values are not lowercased. 

You can use this definition instead:

<fieldType name="idstr" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>





Re: having problem about Solr Date Field.

2010-10-04 Thread Gora Mohanty
On Mon, Oct 4, 2010 at 10:24 PM, Kouta Osabe  wrote:
> Hi,All
>
> I have a problem about Solr Date Field.
[...]

> the value of pub_date column comes from MySQL and actually value is
> "2010-10-05 00:00:00".
>
> I regist "foo" bean to Solr through SolrJ like "new
> CommonsHttpSolrServer().addBean(foo)"
>
> I expected "2010-10-05 00:00:00" to display by Solr Admin but
> "2010-10-04T15:00:00Z" displayed on Solr Admin.
>
> is this timezone problem?(I live in Tokyo Japan).
[...]

Yes, it is a timezone issue. Japan is UTC + 9h, which is the offset
that you are seeing.

As far as I know, Solr does not itself do any timezone conversion, but depends
on the time retrieved by the driver. If you want UTC, instead of local time, you
might want to look into getDate( columnName, Calendar ). Please see
 
http://download.oracle.com/javase/1.4.2/docs/api/java/sql/ResultSet.html#getDate(int,%20java.util.Calendar)

Regards,
Gora


Re: having problem about Solr Date Field.

2010-10-04 Thread Ahmet Arslan
> I expected "2010-10-05 00:00:00" to display by Solr Admin
> but
> "2010-10-04T15:00:00Z" displayed on Solr Admin.
> 
> is this timezone problem?(I live in Tokyo Japan).

Probably. Solr stores/converts dates in/to UTC timezone. 


  


Re: Dismax Filtering Hyphens? Why is this not working? How do I debug Dismax?

2010-10-04 Thread Scott Gonyea
Wow, that's pretty infuriating.  Thank you for the suggestion.  I
added it to the Wiki, with the hope that if it contains misinformation
then someone will correct it and, consequently, save me from another
one of these experiences :)  (...and to also document that, hey, there
is a tokenizer which treats the entire field as an exact value.)

Will go this route and re-index everything back into Solr...again...sigh.

Scott

On Mon, Oct 4, 2010 at 10:07 AM, Ahmet Arslan  wrote:
>>
>>     > name="idstr"   class="solr.StrField">
>>       
>>         > class="solr.PatternTokenizerFactory" pattern="(.*)"
>> group="1"/>
>>           > class="solr.LowerCaseFilterFactory"/>
>>       
>
> This definition is invalid. You cannot use charfilter/tokenizer/tokenfilter 
> with solr.StrField.
>
> But it is interesting that (i just tested) analysis.jsp (1.4.1) displays as 
> if its working. But if you observe at /schema.jsp you will see that real 
> indexed values are not lowercased.
>
> You can use this definition instead:
>
> 
>  
>   
>   
>   
>  
> 
>
>
>
>


Re: SolrCore / Index Searcher Instances

2010-10-04 Thread entdeveloper

Make sense. However, one of the reasons I was asking was that we've
configured Solr to use RAMDirectory and it appears that it loads the index
into memory twice. I suspect the first time is for warming firstSearcher and
the second time is for warming newSearcher. It makes our jvm memory
requirements > 2x indexSize, which for us is a lot since indexSize=8GB.

I'm wondering why it either a) loads the index twice, or b) seems to not
release the 2nd load of the RAMDirectory in memory
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCore-Index-Searcher-Instances-tp1599373p1631329.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to update a distributed index?

2010-10-04 Thread bbarani

Hi,

We are maintaining multiple SOLR indexes, one for each source (the source data
is too huge to be stored in a single index), and we are using shards to do a
distributed search across all the SOLR indexes.

We also update the SOLR documents (which were already indexed) using XML push:

http://server:8983/solr/db/update?stream.body=9-MDSTR-17661666field
name="related_uid">6-DSTR-tx2sbtco01-INTCSTP-INTCSTP-Oracle<

http://server:8983/solr/db/update?stream.body=

Now since there are multiple index being maintained, is there a way to push
this XML to a particular index based on type?

Each index maintains the SOLR documents corresponding to one type. For example,
index 1 stores documents of one type, and index 2 stores pictures.

If I send an XML update (which will contain the type as one of the attributes) and
its type value is documents, it should be pushed to index 1, and if it
is pictures, it should be pushed to index 2.

Is there a way to update the index using shards?

Any suggestions / ideas would be of great help for me.

Thanks,
Barani


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-update-a-distributed-index-tp1631946p1631946.html
Sent from the Solr - User mailing list archive at Nabble.com.


Using Solr Analyzers in Lucene

2010-10-04 Thread Max Lynch
Hi,
I asked this question a month ago on lucene-user and was referred here.

I have content being analyzed in Solr using these tokenizers and filters:


   
 




  
  




  


Basically I want to be able to search against this index in Lucene with one
of my background searching applications.

My main reason for using Lucene over Solr for this is that I use the
highlighter to keep track of exactly which terms were found, which feeds
my own scoring system, and I always collect the whole set of found
documents.  I've messed around with using boosts, but they weren't fine-grained
enough and I wasn't able to effectively create a score threshold (would
creating my own scorer be a better idea?)

Is it possible to use this analyzer from Lucene, or at least re-create it in
code?

Thanks.


Re: Using Solr Analyzers in Lucene

2010-10-04 Thread Max Lynch
I have made progress on this by writing my own Analyzer.  I basically added
the TokenFilters that are under each of the solr factory classes.  I had to
copy and paste the WordDelimiterFilter because, of course, it was package
protected.
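
Roughly, the resulting analyzer has this shape (a sketch against the Lucene 3.0-style
API; the copied WordDelimiterFilter is elided and the stemming filter is an assumption,
so adjust it to match the schema):

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.snowball.SnowballFilter;

public class SolrLikeTextAnalyzer extends Analyzer {
    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new WhitespaceTokenizer(reader);
        // The copied WordDelimiterFilter goes here, with the same flags as the
        // schema (generateWordParts=0, generateNumberParts=1, catenateWords=1, ...).
        stream = new LowerCaseFilter(stream);
        // Stemmer assumed; the schema only shows protected="protwords.txt".
        stream = new SnowballFilter(stream, "English");
        return stream;
    }
}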



On Mon, Oct 4, 2010 at 3:05 PM, Max Lynch  wrote:

> Hi,
> I asked this question a month ago on lucene-user and was referred here.
>
> I have content being analyzed in Solr using these tokenizers and filters:
>
>  positionIncrementGap="100">
>
>  
>
>  generateWordParts="0" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> 
>  protected="protwords.txt"/>
>   
>   
> 
>  generateWordParts="0" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> 
>  protected="protwords.txt"/>
>   
> 
>
> Basically I want to be able to search against this index in Lucene with one
> of my background searching applications.
>
> My main reason for using Lucene over Solr for this is that I use the
> highlighter to keep track of exactly which terms were found which I use for
> my own scoring system and I always collect the whole set of found
> documents.  I've messed around with using Boosts but it wasn't fine grained
> enough and I wasn't able to effectively create a score threshold (would
> creating my own scorer be a better idea?)
>
> Is it possible to use this analyzer from Lucene, or at least re-create it
> in code?
>
> Thanks.
>
>


Some Parser Resources / Links, and some related questions

2010-10-04 Thread Mark Bennett
I'm fiddling around with DisMax again; my love/hate relationship continues.

Basically I'm hunting for custom parser examples.  But I've also got some
links that might be helpful to others.

Alternate XML Syntax:
Project: XML Query Syntax is being worked on, in SOLR-839
Link: https://issues.apache.org/jira/browse/SOLR-839
Question: Any idea when this will be in a mainstream release?

Solr DisMax Refactoring:
Project: The dismax in Solr was refactored to make it easier to override and
reuse, in SOLR-786
Link: https://issues.apache.org/jira/browse/SOLR-786
Status: This has been in the mainstream code for over a year, seems to be
good.
Question: Any resources / examples of making heavy use of this?  Any good
samples?

IBM / Lucene Parser Refactoring:
Project: A team at IBM worked on refactoring the Lucene/Solr parser.
Link: Presentation from '09:
http://files.meetup.com/1460078/M%20Busch%20Lucene%20Meetup%20Sf%20June%2009.pdf
Link: Discussion:
http://www.mail-archive.com/java-...@lucene.apache.org/msg21682.html
Link: Jira: https://issues.apache.org/jira/browse/LUCENE-1567
Question: Is this now considered "mainstream", the new default?  Any
resources / examples of taking advantage of this new refactoring?

Advanced Query Syntax:
Summary: Lets you use curly braces and other advanced syntax for nested
queries and function calls
Link: http://wiki.apache.org/solr/LocalParams
Link:
http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/
Link:
http://www.lucidimagination.com/search/document/ea7b0b27b1b17b1c/re_replacing_fast_functionality_atsesam_no_shinglefilter_exactmatching
Hoss has since answered my particular question about this.

SolrJ Advanced Queries
Summary: SolrJ is a Java API on top of Solr that lets you work entirely in
Java, instead of using REST/CGI arguments and XML output
You can build queries with it.
Link: http://wiki.apache.org/solr/Solrj#Advanced_usage
Question: Any good examples of custom query parsers and SolrJ?  I think some
advanced features aren't always available in SolrJ, it may lag a bit.
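
For the basic case, building and running a query with SolrJ looks roughly like this
(a sketch using the stock 1.4-era API; the URL and example query are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SolrjQueryExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("ipod");
        query.setQueryType("dismax");  // equivalent to qt=dismax
        query.setRows(10);
        QueryResponse rsp = server.query(query);
        System.out.println("hits: " + rsp.getResults().getNumFound());
    }
}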

--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513


Re: Solr with example Jetty and score problem

2010-10-04 Thread Floyd Wu
Hi Chris

Thanks. But do you have any suggestions or workarounds to deal with it?

Floyd



2010/10/2 Chris Hostetter 

>
> : But when I issue the query with shard(two instances), the response XML
> will
> : be like following.
> : as you can see, that score has been transferred to a <float> element of an <arr>
>...
> : <arr name="score">
> :   <float>1.9808292</float>
> : </arr>
>
> The root cause of these seems to be your catchall dynamic field
> declaration...
>
> : : multiValued="true" termVectors="true"
> : termPositions="true"
> : termOffsets="true" omitNorms="false"/>
>
> ...that line (specificly the fact that it's multiValued="true") seems to
> be confusing the results aggregation code.  my guess is that it's
> looping over all the fields, and looking them up in the schema to see if
> they are single/multi valued but not recognizing that "score" is
> special.
>
> https://issues.apache.org/jira/browse/SOLR-2140
>
>
> -Hoss
>
> --
> http://lucenerevolution.org/  ...  October 7-8, Boston
> http://bit.ly/stump-hoss  ...  Stump The Chump!
>
>


Different between Lucid dist. & Apache dist. ?

2010-10-04 Thread Floyd Wu
Hi there,

What is the difference between Lucid distribution of Solr and Apache
distribution?

And can I use Lucid distribution for free in my commercial project?


Re: How to update a distributed index?

2010-10-04 Thread Otis Gospodnetic
Hi,

Even with the sharded index you typically go to the master(s) that has/have all 
your shards and update the appropriate shard (in your case, based on the doc 
type).
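
On the client side, that routing can be as simple as picking the right core URL per
doc type; a minimal SolrJ sketch (the URLs, core names, and the "type" field are
assumptions based on your example):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class TypeRouter {
    private final SolrServer documentsIndex;
    private final SolrServer picturesIndex;

    public TypeRouter() throws Exception {
        // Point these at whichever master owns each index; shards=... is only
        // used for querying, updates always go to a single core.
        documentsIndex = new CommonsHttpSolrServer("http://server:8983/solr/documents");
        picturesIndex  = new CommonsHttpSolrServer("http://server:8983/solr/pictures");
    }

    public void add(SolrInputDocument doc) throws Exception {
        String type = (String) doc.getFieldValue("type");
        SolrServer target = "pictures".equals(type) ? picturesIndex : documentsIndex;
        target.add(doc);
        target.commit();
    }
}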

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

 


- Original Message 
> From: bbarani 
> To: solr-user@lucene.apache.org
> Sent: Mon, October 4, 2010 3:52:02 PM
> Subject: How to update a distributed index?
> 
> 
> Hi,
> 
> We are maintaining multiple SOLR index, one for each source (the source data
> is too huge to be stored in a single index) and we are using shards to do a
> distributed search across all the SOLR index.
> 
> We also update the SOLR documents (which was already indexed) using XML push
> 
> Now since there are multiple index being maintained, is there a way to push
> this XML to a particular index based on type?
> 
> Each index maintains the SOLR document corresponding to each type. For eg:
> index 1 stored documents (type) , index 2 stores (pictures).
> 
> If I send a XML (which will contain the type as one of the attributes) and
> if it has documents as the value it should be pushed to index 1 and if it
> has pictures it should be pushed to index 2.
> 
> Is there a way to update the index using shards?
> 
> Any suggestions / ideas would be of great help for me.
> 
> Thanks,
> Barani
> 
> 
> -- 
> View this message in context: 
>http://lucene.472066.n3.nabble.com/How-to-update-a-distributed-index-tp1631946p1631946.html
>
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Prioritizing advectives in solr search

2010-10-04 Thread Otis Gospodnetic
Hi,

If you want "blue" to be used in search, then you should not treat it as a 
stopword.

Re payloads: http://search-lucene.com/?q=payload+score
and http://search-lucene.com/?q=payload+score&fc_type=wiki (even better, look 
at 
hit #1)
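
The query-time piece of that is typically a payload-aware query (e.g. PayloadTermQuery)
plus a Similarity whose scorePayload reads what was written at index time. A rough
sketch against the Lucene 2.9/3.x API, assuming the payload is an encoded float:

import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.DefaultSimilarity;

public class AdjectiveAwareSimilarity extends DefaultSimilarity {
    // Called for payload-aware queries such as PayloadTermQuery. At index time a
    // POS tagger would have written e.g. 0.2f for adjectives and 1.0f otherwise.
    @Override
    public float scorePayload(int docId, String fieldName, int start, int end,
                              byte[] payload, int offset, int length) {
        if (payload == null || length == 0) {
            return 1.0f;
        }
        return PayloadHelper.decodeFloat(payload, offset);
    }
}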

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Hasnain 
> To: solr-user@lucene.apache.org
> Sent: Mon, October 4, 2010 9:50:46 AM
> Subject: Re: Prioritizing advectives in solr search
> 
> 
> Hi Otis,
> 
>  Thank you for replying,  unfortunately Im unable to fully grasp what
> you are trying to say, can you  please elaborate what is payload with
> adjective terms?
> 
> also Im using  stopwords.txt to stop adjectives, adverbs and verbs, now when
> I search for  "Blue hammers", solr searches for "blue hammers" and "hammers"
> but not  "blue", but the problem here is user can also search for just
> "Blue", then it  wont search for anything...
> 
> any suggestions on this?? 
> 
> -- 
> View  this message in context: 
>http://lucene.472066.n3.nabble.com/Prioritizing-adjectives-in-solr-search-tp1613029p1629725.html
>
> Sent  from the Solr - User mailing list archive at Nabble.com.
> 


Re: Prioritizing advectives in solr search

2010-10-04 Thread Walter Underwood
I think this is a bad idea. The tf.idf algorithm will already put a higher 
weight on "hammers" than on "blue", because "hammers" will be more rare than 
"blue". Plus, you are making huge assumptions about the queries. In a search 
for "Canon camera", "Canon" is an adjective, but it is the important part of 
the query.

Have you looked at your query logs and which queries are successful and which 
are not?

Don't make radical changes like this unless you can justify them from the logs.

wunder

On Oct 4, 2010, at 8:38 PM, Otis Gospodnetic wrote:

> Hi,
> 
> If you want "blue" to be used in search, then you should not treat it as a 
> stopword.
> 
> Re payloads: http://search-lucene.com/?q=payload+score
> and http://search-lucene.com/?q=payload+score&fc_type=wiki (even better, look 
> at 
> hit #1)
> 
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
> 
> 
> 
> - Original Message 
>> From: Hasnain 
>> To: solr-user@lucene.apache.org
>> Sent: Mon, October 4, 2010 9:50:46 AM
>> Subject: Re: Prioritizing advectives in solr search
>> 
>> 
>> Hi Otis,
>> 
>> Thank you for replying,  unfortunately Im unable to fully grasp what
>> you are trying to say, can you  please elaborate what is payload with
>> adjective terms?
>> 
>> also Im using  stopwords.txt to stop adjectives, adverbs and verbs, now when
>> I search for  "Blue hammers", solr searches for "blue hammers" and "hammers"
>> but not  "blue", but the problem here is user can also search for just
>> "Blue", then it  wont search for anything...
>> 
>> any suggestions on this?? 
>> 
>> -- 
>> View  this message in context: 
>> http://lucene.472066.n3.nabble.com/Prioritizing-adjectives-in-solr-search-tp1613029p1629725.html
>> 
>> Sent  from the Solr - User mailing list archive at Nabble.com.
>> 






Re: multi level faceting

2010-10-04 Thread Otis Gospodnetic
Hi,

I *think* this is not what Vincent was after.  If I read the suggestions 
correctly, you are saying to use &fq=x&fq=y -- multiple fqs.
But I think Vincent is wondering how to end up with something that will let him 
create a UI with multi-level facets (with a single request), e.g.

Footwear (100)
  Sneakers (20)
Men (1)
Women (19)

  Dancing shoes (10)
Men (0)
Women (10)
...

If this is what Vincent was after, I'd love to hear suggestions myself. :)

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Jason Brown 
> To: solr-user@lucene.apache.org
> Sent: Mon, October 4, 2010 11:34:56 AM
> Subject: RE: multi level faceting
> 
> Yes, by adding fq back into the main query you will get results increasingly  
>filtered each time.
> 
> You may run into an issue if you are displaying facet  counts, as the facet 
>part of the query will also obey the increasingly filtered  fq, and so not 
>display counts for other categories anymore from the chosen facet  (depends if 
>you need to display counts from a facet once the first value from  the facet 
>has 
>been chosen if you get my drift). Local params are a way to deal  with this by 
>not subjecting the facet count to the same fq restriction (but  allowing the 
>search results to obey it).
> 
> 
> 
> -Original  Message-
> From: Nguyen, Vincent (CDC/OD/OADS) (CTR) [mailto:v...@cdc.gov]
> Sent: Mon 04/10/2010  16:34
> To: solr-user@lucene.apache.org
> Subject:  RE: multi level faceting
> 
> Ok.  Thanks for the quick  response.
> 
> Vincent Vu Nguyen
> Division of Science Quality and  Translation
> Office of the Associate Director for Science
> Centers for  Disease Control and Prevention (CDC)
> 404-498-6154
> Century Bldg  2400
> Atlanta, GA 30329 
> 
> 
> -Original Message-
> From:  Allistair Crossley [mailto:a...@roxxor.co.uk] 
> Sent: Monday, October  04, 2010 9:40 AM
> To: solr-user@lucene.apache.org
> Subject:  Re: multi level faceting
> 
> I think that is just sending 2 fq facet queries  through. In Solr PHP I
> would do that with, e.g.
> 
> $params['facet'] =  true;
> $params['facet.fields'] = array('Size');
> $params['fq'] =>  array('sex' => array('Men', 'Women'));
> 
> but yes i think you'd have to  send through what the current facet query
> is and add it to your next  drill-down
> 
> On Oct 4, 2010, at 9:36 AM, Nguyen, Vincent (CDC/OD/OADS)  (CTR) wrote:
> 
> > Hi,
> > 
> > 
> > 
> > I was wondering  if there's a way to display facet options based on
> > previous facet  values.  For example, I've seen many shopping sites
> where
> > a user  can facet by "Mens" or "Womens" apparel, then be shown "sizes"
> to
> >  facet by (for Men or Women only - whichever they chose).  
> > 
> > 
> > 
> > Is this something that would have to be handled at the  application
> > level?
> > 
> > 
> > 
> > Vincent Vu  Nguyen
> > 
> > 
> > 
> > 
> > 
> 
> 
> 
> 
> If you  wish to view the St. James's Place email disclaimer, please use the 
>link  below
> 
> http://www.sjp.co.uk/portal/internet/SJPemaildisclaimer
> 


Differences between FilterFactory and TokenizerFactory?

2010-10-04 Thread Andy
There are EdgeNGramFilterFactory & EdgeNGramTokenizerFactory.

Likewise there are StandardFilterFactory & StandardTokenizerFactory.

LowerCaseFilterFactory & LowerCaseTokenizerFactory.

Seems like they always come in pairs. 

What are the differences between FilterFactory and TokenizerFactory? When 
should I use one as opposed to the other?

Thanks



  


ant build problem

2010-10-04 Thread satya swaroop
Hi all,
I updated my Solr trunk to revision 1004527. When I compile the trunk
with ant I get many warnings, but the build is successful. The
warnings are here:
common.compile-core:
[mkdir] Created dir:
/home/satya/temporary/trunk/lucene/build/classes/java
[javac] Compiling 475 source files to
/home/satya/temporary/trunk/lucene/build/classes/java
[javac] warning: [path] bad path element
"/usr/share/ant/lib/hamcrest-core.jar": no such file or directory
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:455:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:705:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:812:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:983:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/search/FieldCacheImpl.java:209:
warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: T
[javac] key.creator.validate( (T)value, reader);
[javac]  ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/search/FieldCacheImpl.java:278:
warning: [unchecked] unchecked call to
Entry(java.lang.String,org.apache.lucene.search.cache.EntryCreator) as a
member of the raw type org.apache.lucene.search.FieldCacheImpl.Entry
[javac] return (ByteValues)caches.get(Byte.TYPE).get(reader, new
Entry(field, creator));
ptionList.addAll(exceptions);

||

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files additionally use unchecked or unsafe
operations.
[javac] 100 warnings

BUILD SUCCESSFUL
Total time: 19 seconds


I have pasted only the beginning of the warnings here.
After compiling, I ran "ant test", but it failed.

I didn't find any hamcrest-core.jar in my ant library.
I use ant 1.7.1.


Regards,
satya


Solr admin level configurations for production

2010-10-04 Thread Siva Prasad Janapati
 Hi,

We have configured Solr search over a large data set (~6 million records). To increase
search performance, are there any admin-level configurations we need to
make? Please suggest some admin-level settings.

Regards,
Siva


Numeric search in text field

2010-10-04 Thread javaxmlsoapdev

Hello,

I have a string "Marsh 1" (no quotes while searching). If I put "Marsh 1" in
the search box with no quotes I get the expected results back, but when I search
for just "1" (again no quotes) I don't get any results back. I use
WordDelimiterFilterFactory as follows. Any idea?


  
  
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Numeric-search-in-text-field-tp1633741p1633741.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr UIMA integration

2010-10-04 Thread maheshkumar

Hi Tommaso,

I have registered on both sites and got the API keys,
but I am getting a new error.

Oct 4, 2010 6:15:04 PM
org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl
callAnalysisComponentProcess(405)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
at
org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:138)
at
org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at
org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
at
org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
at
org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
at
org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
at
org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
at
org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
at
org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at
org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
at
org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.executeAE(UIMAUpdateRequestProcessor.java:102)
at
org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processFieldValue(UIMAUpdateRequestProcessor.java:95)
at
org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:59)
at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
Caused by:
org.apache.uima.alchemy.digester.exception.ResultDigestingException:
org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException
: ERROR
at
org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:133)
... 31 more
Caused by:
org.apache.uima.alchemy.annotator.exception.AlchemyCallFailedException:
ERROR
at
org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:129)
... 31 more
Oct 4, 2010 6:15:04 PM
org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl
processAndOutputNewCASes(275)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
at
org.apache.uima.alchemy.annotator.AbstractAlchemyAnnotator.process(AbstractAlchemyAnnotator.java:138)
at
org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at
org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
at
org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
at
org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
at
org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_im

RE: multi level faceting

2010-10-04 Thread Ephraim Ofir
Take a look at "Mastering the Power of Faceted Search with Chris
Hostetter"
(http://www.lucidimagination.com/solutions/webcasts/faceting).  I think
there's an example of what you're looking for there.
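
The local-params trick mentioned earlier in the thread (tagging the fq and excluding
that tag when counting) looks roughly like this in SolrJ; the field names are made up:

import org.apache.solr.client.solrj.SolrQuery;

public class MultiSelectFacetExample {
    public static SolrQuery build() {
        SolrQuery q = new SolrQuery("*:*");
        // Filter on the already-chosen facet value, but tag the filter...
        q.addFilterQuery("{!tag=sexTag}sex:Men");
        q.setFacet(true);
        // ...and exclude that tag when counting, so the "sex" facet still shows
        // counts for Women as well; "size" is faceted within the Men filter.
        q.addFacetField("{!ex=sexTag}sex");
        q.addFacetField("size");
        return q;
    }
}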

Ephraim Ofir

-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
Sent: Tuesday, October 05, 2010 5:44 AM
To: solr-user@lucene.apache.org
Subject: Re: multi level faceting

Hi,

I *think* this is not what Vincent was after.  If I read the suggestions

correctly, you are saying to use &fq=x&fq=y -- multiple fqs.
But I think Vincent is wondering how to end up with something that will
let him 
create a UI with multi-level facets (with a single request), e.g.

Footwear (100)
  Sneakers (20)
Men (1)
Women (19)

  Dancing shoes (10)
Men (0)
Women (10)
...

If this is what Vincent was after, I'd love to hear suggestions myself.
:)

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Jason Brown 
> To: solr-user@lucene.apache.org
> Sent: Mon, October 4, 2010 11:34:56 AM
> Subject: RE: multi level faceting
> 
> Yes, by adding fq back into the main query you will get results
increasingly  
>filtered each time.
> 
> You may run into an issue if you are displaying facet  counts, as the
facet 
>part of the query will also obey the increasingly filtered  fq, and so
not 
>display counts for other categories anymore from the chosen facet
(depends if 
>you need to display counts from a facet once the first value from  the
facet has 
>been chosen if you get my drift). Local params are a way to deal  with
this by 
>not subjecting the facet count to the same fq restriction (but
allowing the 
>search results to obey it).
> 
> 
> 
> -Original  Message-
> From: Nguyen, Vincent (CDC/OD/OADS) (CTR) [mailto:v...@cdc.gov]
> Sent: Mon 04/10/2010  16:34
> To: solr-user@lucene.apache.org
> Subject:  RE: multi level faceting
> 
> Ok.  Thanks for the quick  response.
> 
> Vincent Vu Nguyen
> Division of Science Quality and  Translation
> Office of the Associate Director for Science
> Centers for  Disease Control and Prevention (CDC)
> 404-498-6154
> Century Bldg  2400
> Atlanta, GA 30329 
> 
> 
> -Original Message-
> From:  Allistair Crossley [mailto:a...@roxxor.co.uk] 
> Sent: Monday, October  04, 2010 9:40 AM
> To: solr-user@lucene.apache.org
> Subject:  Re: multi level faceting
> 
> I think that is just sending 2 fq facet queries  through. In Solr PHP
I
> would do that with, e.g.
> 
> $params['facet'] =  true;
> $params['facet.fields'] = array('Size');
> $params['fq'] =>  array('sex' => array('Men', 'Women'));
> 
> but yes i think you'd have to  send through what the current facet
query
> is and add it to your next  drill-down
> 
> On Oct 4, 2010, at 9:36 AM, Nguyen, Vincent (CDC/OD/OADS)  (CTR)
wrote:
> 
> > Hi,
> > 
> > 
> > 
> > I was wondering  if there's a way to display facet options based on
> > previous facet  values.  For example, I've seen many shopping sites
> where
> > a user  can facet by "Mens" or "Womens" apparel, then be shown
"sizes"
> to
> >  facet by (for Men or Women only - whichever they chose).  
> > 
> > 
> > 
> > Is this something that would have to be handled at the  application
> > level?
> > 
> > 
> > 
> > Vincent Vu  Nguyen
> > 
> > 
> > 
> > 
> > 
> 
> 
> 
> 
> If you  wish to view the St. James's Place email disclaimer, please
use the 
>link  below
> 
> http://www.sjp.co.uk/portal/internet/SJPemaildisclaimer
> 


Tuning Solr

2010-10-04 Thread Floyd Wu
Hi there,

If I don't need MoreLikeThis, spellcheck, or highlighting,
can I remove those configuration sections from solrconfig.xml?
In other words, does Solr load and use these SearchComponents on startup and
during runtime?

Will removing this configuration speed up queries or not?

Thanks


Begins with and ends with word

2010-10-04 Thread Maddy.Jsh

Hi,

I have 2 documents with following values.
Doc1
Subject: Weekly transport

Doc2
Subject: Week report on transportation

I need to search documents in 4 formats

1. Begins with “week”
 It should return documents which have "week" as the first word, i.e. doc1

2. Begins with “week*”
 It should return documents which have "week" or its derivatives (weekly,
weeks) as the first word, i.e. doc1 and doc2

3. Ends with “transport”
 It should return documents which end with the word "transport", i.e. doc1

4. Ends with “transport*”
 It should return documents which end with the word "transport" or its
derivatives (transportation, transporter, etc.), i.e. doc1 and doc2


Please let me know if there are any solutions.

Thanks.






-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Begins-with-and-ends-with-word-tp1634376p1634376.html
Sent from the Solr - User mailing list archive at Nabble.com.