Re: Stop Words in SpellCheckComponent

2012-06-01 Thread Jack Krupansky
Your earlier email had this option in your spellcheck.de field type analyzer 
for the StopFilterFactory:


words=german_stop_long.txt

But your most recent email referred to stopword.txt.

So, either add "the" to german_stop_long.txt, or change the words option 
of your stopfilter to refer to stopwords.txt.


BTW, I think you can actually have a comma-separated list of stopword files, 
so you can write:


words=german_stop_long.txt,stopwords.txt

-- Jack Krupansky

-Original Message- 
From: Matthias Müller

Sent: Friday, June 01, 2012 1:44 AM
To: solr-user@lucene.apache.org
Subject: Re: Stop Words in SpellCheckComponent


<str name="field">spellcheck_de</str>

That should reference a field, not a field type.


Thanks for your help. But I did that, too.

Here I'll show that even the Solr example webapp makes suggestions for
stop words. I have:

1. added "the" to stopwords.txt
2. added "thex" to an example document (field "name")
3. started Solr
4. indexed the example files (sh post.sh *.xml)
5. searched for "the solr":
http://myhost:8983/solr/select?q=the+solr&spellcheck=true&wt=json
6. got the desired result, but also the wrong suggestion "thex"

{ "response" : { "docs" : [ {...  "name" : "Solr, thex Enterprise
Search Server", ..  } ],
 "numFound" : 1,
...  },
...
 "spellcheck" : { "suggestions" : [ "the",
 {... "suggestion" : [ "thex" ]  }
   ] }
}
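
For readers who want to check such a response programmatically, here is a minimal Python sketch that pulls the suggestions out of the JSON. It assumes the flat layout quoted above, where the "suggestions" list alternates between a term and a details object; the sample response is hypothetical and only shaped like the one in this mail, and the function name is made up for illustration.

```python
import json


def spellcheck_suggestions(response_text):
    """Return {query_term: [suggested words]} from a Solr JSON response."""
    data = json.loads(response_text)
    flat = data.get("spellcheck", {}).get("suggestions", [])
    result = {}
    # Assumed layout: the list alternates term, details, term, details, ...
    for term, details in zip(flat[0::2], flat[1::2]):
        if isinstance(details, dict):
            result[term] = list(details.get("suggestion", []))
    return result


# Hypothetical response, shaped like the output quoted in this mail.
sample = """{
  "response": {"numFound": 1,
               "docs": [{"name": "Solr, thex Enterprise Search Server"}]},
  "spellcheck": {"suggestions": ["the", {"numFound": 1, "suggestion": ["thex"]}]}
}"""

print(spellcheck_suggestions(sample))
```

A check like this makes it easy to assert, after each configuration change, that no stop word shows up as a key in the result.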


Here's the complete diff between the original download and my 3 
modifications:


diff -r apache-solr-3.6.0/example/exampledocs/solr.xml apache-solr-3.6.0x/example/exampledocs/solr.xml
21c21
<   <field name="name">Solr, the Enterprise Search Server</field>
---
>   <field name="name">Solr, thex Enterprise Search Server</field>

diff -r apache-solr-3.6.0/example/solr/conf/solrconfig.xml apache-solr-3.6.0x/example/solr/conf/solrconfig.xml
781a782,785
>  <arr name="last-components">
>    <str>spellcheck</str>
>  </arr>
1122a1127
>   <str name="buildOnCommit">true</str>

diff -r apache-solr-3.6.0/example/solr/conf/stopwords.txt apache-solr-3.6.0x/example/solr/conf/stopwords.txt
14a15,16
> 
> the




Re: How can I remove the home page priority of site home page from search results

2012-06-01 Thread Jack Krupansky
Add debugQuery=true to your query and check how the home page is scored. 
That should give you a clue why the title is not boosting the score enough. 
Maybe you simply need a higher boost for title, but let the debugQuery 
scoring be your guide.


Actually, if you are explicitly referencing a field in your query 
(title:abc), that won't pick up the title boost from the qf field list. 
You would need an explicit boost in the query itself.


But I'm not sure I understand how your query gets expanded: 
q=title:'.$keywords.'


Maybe you wanted: q=title:(.$keywords.), because otherwise spaces between 
the keywords would end the first fielded term and then proceed to 
reference the dismax field list (qf).
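
To make the two points above concrete (parentheses around the keywords, and an explicit boost on the fielded clause), here is a minimal Python sketch that builds such a query string. The function name and the default boost value are assumptions for illustration only; the qf value is the one from the original question. The keywords are inserted as-is, so real code would still need to escape query syntax characters.

```python
from urllib.parse import urlencode, parse_qs


def build_title_query(keywords, title_boost=10):
    """Build a Solr query string that keeps all keywords in the title field."""
    params = {
        # Parentheses keep every keyword inside title:(...); the ^boost is
        # explicit because qf boosts do not apply to fielded clauses.
        "q": "title:({})^{}".format(keywords, title_boost),
        "defType": "edismax",
        "qf": "title^10 url^9 content^5",
        "debugQuery": "true",
    }
    return urlencode(params)


query_string = build_title_query("gold ring")
print(query_string)
# Decoding the string again shows the q parameter Solr would receive.
print(parse_qs(query_string)["q"][0])
```

Using urlencode also takes care of the separators and escaping, which are easy to get wrong when the URL is concatenated by hand.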


-- Jack Krupansky

-Original Message- 
From: Shameema Umer

Sent: Friday, June 01, 2012 1:46 AM
To: solr-user@lucene.apache.org
Subject: How can I remove the home page priority of site home page from 
search results


My query is like this:

?q=title:'.$keywords.'&defType=edismax&qf=title^10 url^9
content^5&start=0&rows=10&version=2.2&indent=on&hl=true&hl.fl=content&hl.fragsize=300

My results show site home page as the first result even though there are
other pages with title scoring more for the given keywords.

I need to give less priority to site home page than other pages. Please
help.

Thanks
Shameema 



Re: Multi-words synonyms matching

2012-06-01 Thread Bernd Fehling

Are you sure about LUCENE_33 (use of BitVector)?


On 31.05.2012 17:20, O. Klein wrote:
> I have been struggling with this as well and found that using LUCENE_33 gives
> the best results.
>
> But as it will be deprecated, this is no lasting solution. Maybe somebody
> knows one?
 


Re: How can I remove the home page priority of site home page from search results

2012-06-01 Thread Shameema Umer
I added braces to the keywords and debugged.
I really need to boost term frequency. Please help.

1.2125369 = (MATCH) fieldWeight(title:gold in 102), product of:
  1.0 = tf(termFreq(title:gold)=1)
  4.8501477 = idf(docFreq=11, maxDocs=564)
  0.25 = fieldNorm(field=title, doc=102)

0.5304849 = (MATCH) fieldWeight(title:gold in 422), product of:
  1.0 = tf(termFreq(title:gold)=1)
  4.8501477 = idf(docFreq=11, maxDocs=564)
  0.109375 = fieldNorm(field=title, doc=422)

0.45470136 = (MATCH) fieldWeight(title:gold in 105), product of:
  2.0 = tf(termFreq(title:gold)=4)
  4.8501477 = idf(docFreq=11, maxDocs=564)
  0.046875 = fieldNorm(field=title, doc=105)
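
The three scores in the explain output above can be reproduced by hand, which shows why the document with four occurrences still ranks last. A minimal Python sketch, assuming the classic scoring where tf is the square root of the raw term frequency (consistent with tf=2.0 for termFreq=4 in the output); the function name is made up for illustration.

```python
import math


def field_weight(term_freq, idf, field_norm):
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(term frequency)."""
    return math.sqrt(term_freq) * idf * field_norm


IDF = 4.8501477  # idf(docFreq=11, maxDocs=564), taken from the explain output

# doc id -> (termFreq, fieldNorm), copied from the explain output
docs = {102: (1, 0.25), 422: (1, 0.109375), 105: (4, 0.046875)}

for doc_id, (freq, norm) in docs.items():
    print(doc_id, field_weight(freq, IDF, norm))

# Ratio of the fieldNorm of the first hit to that of the second hit.
print(docs[102][1] / docs[422][1])
```

The idf factor is identical for all three documents, so the ranking is decided by sqrt(termFreq) times fieldNorm; a short title (large fieldNorm) outweighs the extra occurrences in a long one.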


Re: Stop Words in SpellCheckComponent

2012-06-01 Thread Matthias Müller
> But your most recent email referred to stopword.txt.
>
> So, either add "the" to german_stop_long.txt, or change the words option
> of your stopfilter to refer to stopwords.txt.

Sorry for the confusion: the stopfilter refers to stopwords.txt.

Now I'm just talking about the Solr example webapp
(apache-solr-3.6.0.tgz/example), which I slightly modified (as
described in my last mail).

In this example Solr also makes suggestions for stop words.
I can't see a mistake in my configuration.

1. The stopfilter refers to the stopwords.txt:

<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    ...
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true" />
    ...
  </analyzer>
  <analyzer type="query">
    ...
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true" />
    ...
  </analyzer>
</fieldType>

2. The SpellCheckComponent refers to the field name:

 <str name="field">name</str>


Re: Hightlighting and excerpt

2012-06-01 Thread Shameema Umer
Hi Tolga, to get an excerpt you should add the fragment size parameter:
hl.fragsize=300
e.g.:
hl=true&hl.fl=content&hl.fragsize=300

Thanks
Shameema

On Thu, May 31, 2012 at 7:31 PM, Ahmet Arslan <iori...@yahoo.com> wrote:

>> I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB
>> was stressed?
>
> Hi Tolga,
>
> I think you can easily learn the basics using one of the following books.
> http://lucene.apache.org/solr/books.html




Re: Cannot get highlighting to work

2012-06-01 Thread Asfand Qazi

On 31/05/12 21:10, Jack Krupansky wrote:

Try a query that uses a term that doesn't split an alphanumeric term
into two terms.

Then check to see what field type you used for the symbol and
marker_symbol fields and whether the analyzer for that field type has
changed in 3.6.



Aha - yes, not using number fields makes the highlighter work.  The 
analyzer had been changed by another dev (helpfully) for the fields I 
was trying to highlight to solr.KeywordTokenizerFactory - I changed it 
back to solr.WhitespaceTokenizerFactory, as it was in the 1.4 config.


With a lot of hope I tried to fire the same query, but the exact same 
thing happened - the highlighting for a document is an empty document 
(i.e. { } ) just like before.


Any other clues?

Thanks







-- Jack Krupansky
-Original Message- From: Asfand Qazi
Sent: Thursday, May 31, 2012 12:32 PM
To: solr-user@lucene.apache.org
Subject: Cannot get highlighting to work

Hello,

I am having problems with highlighting on a Solr 3.6 instance, while it
was working just fine before on our 1.4 instance.

The solrconfig.xml and schema.xml files are located here:

https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/schema.xml


(please note the incorrect line wrapping - it should be on one line)


https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/solrconfig.xml


(please note the incorrect line wrapping - it should be on one line)


The query I fire off (which worked on the 1.4 instance) is:

/solr/main/select?q=Cbx1&wt=json&hl=true&hl.fl=*&hl.usePhraseHighlighter=true


(please note the incorrect line wrapping - it should be on one line)

I expect a section like:
{
  "MGI:105369": {
    "symbol": [
      "<em>Cbx</em><em>1</em>"
    ],
    "marker_symbol": [
      "<em>Cbx</em><em>1</em>"
    ]
  }
}


I get:
{
  "MGI:105369": { }
}


Can anyone help?

Thanks





--
Regards,
  Asfand Yar Qazi
  Team 87 - High Throughput Gene Targeting
  Wellcome Trust Sanger Institute


--
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 


RE: per-fieldtype similarity not working

2012-06-01 Thread Markus Jelsma
Thanks, but I am clearly missing something. We declare the similarity in the 
fieldType just as in the example, and looking at the example again I don't see 
how it's being done differently. What am I missing and where do I miss it? :)

-Original message-
> From: Robert Muir <rcm...@gmail.com>
> Sent: Thu 31-May-2012 17:47
> To: solr-user@lucene.apache.org
> Subject: Re: per-fieldtype similarity not working
>
> On Thu, May 31, 2012 at 11:23 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
>
>> We simply declare the following in our fieldType:
>> <similarity class="FQCN"/>
>
> That's not enough, see the example:
> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/conf/schema-sim.xml
>
> -- 
> lucidimagination.com
 


Facing problem in SOLR replication

2012-06-01 Thread Krishn Murari Mishra
I get the following errors printed on my slave console many times when I use 
the master-slave replication option:

May 31, 2012 11:50:44 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry
INFO: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) 
caught when processing request: The server 172.16.9.98 failed to respond
May 31, 2012 11:50:44 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry

The replication never completes successfully.

I am using a single master and two slave nodes. The slave nodes poll the 
master every 10 minutes.
There are 256 stores on each node, containing more than 10 million records.

Let me know how to resolve this problem.


Krishn Murari Mishra
Sr. Software Engineer
P: +91.120.400.7100 ext. 435 | F: +91.120.432.4560 | M: +91.78380.02318
TPG Software Pvt. Ltd.  | B-25, Sector-58, Noida (U.P), INDIA- 201301
Website: www.threepillarglobal.com; www.brickred.com



Re: Strip html

2012-06-01 Thread Tigunn
Excuse me, let me explain my need.
I have an XML file, like this example:
I want to index the result of the XSL transformation; I transform my XML to
HTML and get:
-
si les ruches d’abeilles prouvent la
  monarchie, les fourmillières, les troupes d’éléphants ou
de castors prouvent la république.
-
I indexed this with the type text_strip_html, but it is not the result I
want.

What I want: if I search for "castors", Solr returns this XML file (for this
example: castors). I tried strip_tags() (a PHP function) before indexing
again, but it doesn't work.

I want the index to contain not ":castors" or "c astors" or "astors", but
"castors".



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strip-html-tp3987051p3987232.html
Sent from the Solr - User mailing list archive at Nabble.com.


solr commit execute taking time to execute

2012-06-01 Thread Sri Krishna
The data I use for an update is very small, in fact at most 2 to 3 words. Any
suggestions for improvement? The requirement is that the updated index must
be usable there and then, so lazy commit or auto commit are not useful
here.

Here is the log info:

Jun 1, 2012 3:52:39 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2

commit{dir=/home/krishna/Desktop/apache-solr-3.6.0/example/multicore/core1/data/index,segFN=segments_2f,version=1338546106195,generation=87,filenames=[_2i.tii,
_2i.fdt, _2i.nrm, _2m.fnm, _2l.frq, _2l.fdt, _2j.fnm, _2l.fdx, _2k.tis,
_2l.fnm, _2j.frq, _2k.tvx, _2j.nrm, _2l.tii, _2k.tvf, _2m.tii, _2l.prx,
_2k.tvd, _2j.tii, _2i.fdx, _2m.tis, _2l.nrm, _2l.tvd, _2i.fnm, _2l.tvf,
_2j.tis, _2k.nrm, _2i.tvx, _2m.tvd, _2m.prx, _2j.fdt, _2j.prx, _2m.tvf,
_2m.nrm, _2i.tvf, _2i.tvd, _2l.tis, _2k.fnm, _2i.prx, _2m.tvx, _2m.fdx,
_2k.tii, _2m.fdt, _2j.fdx, _2j.tvx, _2l.tvx, segments_2f, _2k.fdx, _2k.prx,
_2k.fdt, _2j.tvd, _2k.frq, _2m.frq, _2j.tvf, _2i.tis, _2i.frq]

commit{dir=/home/krishna/Desktop/apache-solr-3.6.0/example/multicore/core1/data/index,segFN=segments_2g,version=1338546106197,generation=88,filenames=[_2n.nrm,
_2i.tii, _2i.fdt, _2n.tis, _2i.nrm, _2m.fnm, _2n.fdx, _2n.tii, _2n.fdt,
_2l.frq, _2l.fdt, _2j.fnm, _2l.fdx, _2k.tis, _2l.fnm, _2j.frq, _2k.tvx,
_2j.nrm, _2l.tii, _2k.tvf, _2m.tii, _2l.prx, _2k.tvd, _2j.tii, _2i.fdx,
_2m.tis, _2l.nrm, _2l.tvd, _2i.fnm, _2l.tvf, _2n.tvd, _2j.tis, _2k.nrm,
_2i.tvx, _2m.tvd, _2m.prx, _2n.prx, _2j.fdt, _2j.prx, _2m.tvf, _2m.nrm,
_2n.tvf, _2i.tvf, _2i.tvd, _2l.tis, _2k.fnm, _2i.prx, _2n.frq, _2m.tvx,
_2m.fdx, segments_2g, _2n.fnm, _2n.tvx, _2k.tii, _2m.fdt, _2j.fdx, _2j.tvx,
_2l.tvx, _2k.fdx, _2k.prx, _2k.fdt, _2j.tvd, _2k.frq, _2m.frq, _2j.tvf,
_2i.tis, _2i.frq]
Jun 1, 2012 3:52:39 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1338546106197
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher init
INFO: Opening Searcher@4662de7b main
Jun 1, 2012 3:52:39 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Jun 1, 2012 3:52:39 PM org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: {add=[(null)],commit=} 0 413
Jun 1, 2012 3:52:39 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr path=/update
params={waitSearcher=false&commit=true} status=0 *QTime=413 *
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@4662de7b main from Searcher@65919e73 main

fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@4662de7b main

fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@4662de7b main from Searcher@65919e73 main

filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@4662de7b main

filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@4662de7b main from Searcher@65919e73 main

queryResultCache{lookups=2,hits=0,hitratio=0.00,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=179,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=179,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@4662de7b main

queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=179,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=179,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@4662de7b main from Searcher@65919e73 main

documentCache{lookups=2,hits=1,hitratio=0.50,inserts=1,evictions=0,size=1,warmupTime=0,cumulative_lookups=1154,cumulative_hits=593,cumulative_hitratio=0.51,cumulative_inserts=561,cumulative_evictions=0}
Jun 1, 2012 3:52:39 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@4662de7b main


Re: How to find the age of a page

2012-06-01 Thread in.abdul
Shameema Umer,

You can add a new field to the schema. While updating or indexing,
write the current timestamp to that field.
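
As a rough illustration of this suggestion, the following Python sketch stamps a document with the current UTC time before it is sent to the indexer. The field name last_modified and the helper name are hypothetical; the field would have to be declared as a date field in the schema, and the format string assumes Solr's usual ISO-8601 date syntax ending in Z.

```python
from datetime import datetime, timezone


def with_timestamp(doc, field="last_modified", now=None):
    """Return a copy of doc with a UTC timestamp in the given field."""
    now = now or datetime.now(timezone.utc)
    stamped = dict(doc)  # leave the caller's document untouched
    stamped[field] = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return stamped


# Hypothetical document about to be indexed.
doc = {"id": "page-1", "title": "Home"}
print(with_timestamp(doc))
```

The age of a page can then be derived at query time from the difference between this field and the current time.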

Thanks and Regards,
S SYED ABDUL KATHER



On Fri, Jun 1, 2012 at 3:44 PM, Shameema Umer [via Lucene] 
<ml-node+s472066n3987234...@n3.nabble.com> wrote:

> Hi all,
>
> How can I find the age of a page in Solr results, that is, the last updated
> time?
> tstamp refers to the fetch time, not the exact updated time, right?





-
THANKS AND REGARDS,
SYED ABDUL KATHER
--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-find-the-age-of-a-page-tp3987234p3987238.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multi-words synonyms matching

2012-06-01 Thread O. Klein
Looking for some more background information I stumbled upon
https://issues.apache.org/jira/browse/LUCENE-3668. If you read the last post
it confirms my issue. So maybe this is a bug?



Bernd Fehling-2 wrote
>
> Are you sure with LUCENE_33 (Use of BitVector)?
>
>
> On 31.05.2012 17:20, O. Klein wrote:
>> I have been struggling with this as well and found that using LUCENE_33
>> gives the best results.
>>
>> But as it will be deprecated this is no everlasting solution. Maybe
>> somebody knows one?

 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multi-words-synonyms-matching-tp3898950p3987241.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Strip html

2012-06-01 Thread Jack Krupansky
> I tryed to strip_tags() (php function) before index again. But it doesn't 
> work.


What does it not do correctly? Show us. Show an actual document as posted to 
Solr.


As Hoss said, if you are stripping HTML before posting the document to Solr, 
then you want a field type that doesn't use the strip HTML filter. And you 
probably want the French light stemmer to allow search on castor to match 
castors.
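
To illustrate the first point, here is a minimal Python sketch of stripping markup before posting, so that a word split by an inline tag is rejoined. It is not the PHP code from this thread; the sample markup is hypothetical, since the original HTML was lost from the archived message.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Join without a separator so "c" + "astors" becomes "castors" again.
    return "".join(parser.parts)


# Hypothetical markup: the first letter of "castors" sits in its own tag.
sample = '<p>les troupes d’éléphants ou de <span class="initial">c</span>astors</p>'
print(strip_tags(sample))
```

If the text is cleaned like this before it is posted, the field type on the Solr side should not apply an HTML-stripping step a second time.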


Show us the schema with field types and an actual input document that you 
post to Solr.


Unfortunately, we may still be confused about what exact operations you are 
performing and the exact order in which you are performing the operations.


You mentioned PHP, but haven't said exactly how you are using it. Is PHP 
sending the document directly to Solr? If so, we need to know what PHP is 
sending.


-- Jack Krupansky

-Original Message- 
From: Tigunn

Sent: Friday, June 01, 2012 6:00 AM
To: solr-user@lucene.apache.org
Subject: Re: Strip html

Excuse me, let me explain my need.
I have an XML file, like this example:
I want to index the result of the XSL transformation; I transform my XML to
HTML and get:
-
si les ruches d’abeilles prouvent la
 monarchie, les fourmillières, les troupes d’éléphants ou
de castors prouvent la république.
-
I indexed this with the type text_strip_html, but it is not the result I
want.

What I want: if I search for "castors", Solr returns this XML file (for this
example: castors). I tried strip_tags() (a PHP function) before indexing
again, but it doesn't work.

I want the index to contain not ":castors" or "c astors" or "astors", but
"castors".



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strip-html-tp3987051p3987232.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Stop Words in SpellCheckComponent

2012-06-01 Thread Jack Krupansky
You forgot to give us the field definition for name. Is it the same as in 
the 3.6 example, or is it changed?


Make sure that you delete all existing data after you change the 
schema/config.


Do a direct query on the spellcheck field (name:the) to verify whether "the" 
is being indexed or not.


Also, generally, you should have a separate field and field type for the 
spellcheck field so that normal text fields can use stop words.
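
What the direct query checks can be sketched in a few lines of Python. This is only a rough stand-in for a tokenizer, lowercase filter and stop filter chain, not Solr's actual analysis code, and the function name is made up for illustration.

```python
import re


def analyze(text, stopwords):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stopwords]


stopwords = {"the"}

# The modified example document from the earlier mail.
indexed = analyze("Solr, thex Enterprise Search Server", stopwords)
print(indexed)
print("the" in indexed, "thex" in indexed)
```

With the stop filter active, "the" never reaches the index while "thex" does. One possible reading, not confirmed in this thread, is that a spellchecker built from that field then sees "the" as an unknown word and offers the closest indexed term; the name:the query is the way to check the first half of that.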


-- Jack Krupansky

-Original Message- 
From: Matthias Müller

Sent: Friday, June 01, 2012 4:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Stop Words in SpellCheckComponent


> But your most recent email referred to stopword.txt.
>
> So, either add "the" to german_stop_long.txt, or change the words option
> of your stopfilter to refer to stopwords.txt.


Sorry for the confusion: the stopfilter refers to stopwords.txt.

Now I'm just talking about the Solr example webapp
(apache-solr-3.6.0.tgz/example), which I slightly modified (as
described in my last mail).

In this example Solr also makes suggestions for stop words.
I can't see a mistake in my configuration.

1. The stopfilter refers to the stopwords.txt:

<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    ...
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true" />
    ...
  </analyzer>
  <analyzer type="query">
    ...
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true" />
    ...
  </analyzer>
</fieldType>

2. The SpellCheckComponent refers to the field name:

<str name="field">name</str>



Re: Cannot get highlighting to work

2012-06-01 Thread Jack Krupansky
I got confused in the last paragraph - does a purely alphabetic term get 
highlighted properly or not? I am trying to figure out if the problem 
relates only to terms that decompose into phrases (as alphanumeric terms do) 
or for all terms. Thanks.


If the analyzer changes, the data must be reindexed.
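
The reason is that the tokens already stored in the index were produced by the old analyzer. A minimal Python sketch, with toy stand-ins for the two tokenizers mentioned in this thread (keyword: the whole value as one token; whitespace: split on whitespace) and a hypothetical field value:

```python
def keyword_tokenize(value):
    """Toy keyword tokenizer: the whole value is a single token."""
    return [value]


def whitespace_tokenize(value):
    """Toy whitespace tokenizer: one token per whitespace-separated word."""
    return value.split()


def build_index(docs, tokenize):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, value in docs.items():
        for token in tokenize(value):
            index.setdefault(token, set()).add(doc_id)
    return index


# Hypothetical field value; only the document id comes from the thread.
docs = {"MGI:105369": "Cbx1 chromobox 1"}

old_index = build_index(docs, keyword_tokenize)     # built before the change
new_index = build_index(docs, whitespace_tokenize)  # built after reindexing

print("Cbx1" in old_index, "Cbx1" in new_index)
```

Changing the tokenizer in the schema only changes how new input and queries are analyzed; the old index keeps its single-token entries until the data is indexed again.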

-- Jack Krupansky

-Original Message- 
From: Asfand Qazi

Sent: Friday, June 01, 2012 5:08 AM
To: solr-user@lucene.apache.org
Subject: Re: Cannot get highlighting to work

On 31/05/12 21:10, Jack Krupansky wrote:

Try a query that uses a term that doesn't split an alphanumeric term
into two terms.

Then check to see what field type you used for the symbol and
marker_symbol fields and whether the analyzer for that field type has
changed in 3.6.



Aha - yes, not using number fields makes the highlighter work.  The
analyzer had been changed by another dev (helpfully) for the fields I
was trying to highlight to solr.KeywordTokenizerFactory - I changed it
back to solr.WhitespaceTokenizerFactory, as it was in the 1.4 config.

With a lot of hope I tried to fire the same query, but the exact same
thing happened - the highlighting for a document is an empty document
(i.e. { } ) just like before.

Any other clues?

Thanks










Sharing common config between different search handlers

2012-06-01 Thread Jochen Just
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hey there list,

my application needs two search modes. The first mode needs to
consider synonyms, the other one must not.

Currently I have two field types with basically the same configuration,
except that one field type uses a SynonymFilter and the other one doesn't.
Furthermore, there are two SearchHandlers. The first one uses only fields
that recognise synonyms, the other one uses only fields that don't
recognise synonyms.
So far, so good.

But I would like those two SearchHandlers to share the rest of their
configuration, because if anything needs to be changed, it needs to be
done for both SearchHandlers. I think that's kind of ugly.
Additionally, I would like the client (the web page that triggers a
search) not to know anything about the field names. Therefore I do not
simply specify a list of query fields in the request.

To clarify my situation, here is a sample solrconfig.xml (I left out
many of the details to save time and space, but I guess you get the idea).

<requestHandler name="/search_with_synonyms" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="rows">10</str>
    <str name="fl">id, score</str>
    <str name="qf">description_with_synonyms</str>
    <!-- a lot of other configuration happens here -->
  </lst>
</requestHandler>

<requestHandler name="/search_without_synonyms"
                class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="rows">10</str>
    <str name="fl">id, score</str>
    <str name="qf">description_without_synonyms</str>
    <!-- a lot of other configuration happens here -->
  </lst>
</requestHandler>

I thought about using only one SearchHandler and a filter query to get rid
of the synonym results if they are not needed. But as far as I can
tell, that's not possible.
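
One workaround that stays outside Solr is to generate both handler definitions from a single set of shared defaults, so that a change is made in one place only. The following Python sketch is an assumption about how that could look, not a Solr feature; the function and variable names are made up, while the handler and field names are the ones from the sample above.

```python
import xml.etree.ElementTree as ET

# Settings both handlers have in common (taken from the sample above).
SHARED_DEFAULTS = {"defType": "edismax", "rows": "10", "fl": "id, score"}


def request_handler(name, qf, shared=SHARED_DEFAULTS):
    """Build one <requestHandler> element from shared defaults plus its qf."""
    handler = ET.Element(
        "requestHandler", {"name": name, "class": "solr.SearchHandler"}
    )
    defaults = ET.SubElement(handler, "lst", {"name": "defaults"})
    for key, value in {**shared, "qf": qf}.items():
        ET.SubElement(defaults, "str", {"name": key}).text = value
    return handler


handlers = [
    ("/search_with_synonyms", "description_with_synonyms"),
    ("/search_without_synonyms", "description_without_synonyms"),
]
for name, qf in handlers:
    print(ET.tostring(request_handler(name, qf), encoding="unicode"))
```

The generated fragments would still have to be pasted or included into solrconfig.xml by whatever build step is in use; the sketch only removes the duplication at the source.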

Thanks in advance and any idea is welcome,

Jochen
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJPyLoZAAoJEP1xbhgWUHmS72oQAKbh5vr5SmROMO9Eupdfnr7c
2VNqQsPPslzVWSIRVsvsEZxW8RalWsXB33Dayqzqg+5Be0BSBnrwmCYGsMwetHCT
PQjXx9OvuxlGxFeRoLmrMbhDdKLudWPkHmcHK2t8ZTgZ7ZWBJOimNiYIJANHd5Wc
7C0SMqcmiyxA58RvVePtkcZP7ag7kq2CLlamd8hg7cDWs4FYR/4eRXZcgIg76Eoc
CnkgP5w0gRWxNbDI1y+KxvT4XX3lZ/w+Kwr5A7CK1WTm1Y+hrDc7cjNiUU4c/3a1
9kgnhUxTKlzLLeIPWe2qvtXqvgUcMg9l09oFnTQ+u58g+v7wBObEnCJi1IKT7gT3
+pA8kAFY8bAHauoeHg2XZO3PFtowMXXm1Er/5+euEeoRdlOAi9SPO5pCbRnlSAI8
u8QwFfXv3ZeYI4CFsQsFFUX/NVPJuVXerti0n3Ebn6sUXqs0EmxUmr5vpSnQ38Md
NQgdFRWeYeRD341Jy1tqyFh8gtzIUwWA5Otd7tKR//xidhrnq5CCA8kOr+i3AnT/
4w04ite1uGd+m5erspcBR6SkxtLVcSp3rcpzSV0CC2j5vQdxe6b8PBy25cowaxJF
wOtrtyPisvwWMM253GMuO4O6uxv+p/SgP1gdiZ4I9ZMQQwlT/Ny7+APEj93eNrKQ
3Y38BDWKUio30/yThe9G
=4mQ9
-END PGP SIGNATURE-


Re: How can I remove the home page priority of site home page from search results

2012-06-01 Thread Jack Krupansky

What are the three documents?

In any case, it looks like the fieldNorm for title is 2.3 times greater 
for the first document compared to the second document and the third 
document has an even smaller fieldNorm for title.


Further, as the explain output shows, only the title field is being used. This 
goes back to what I said in my last email: only the title field is being used 
in this query.


Maybe the home page URL is being added as the title?

-- Jack Krupansky

-Original Message- 
From: Shameema Umer

Sent: Friday, June 01, 2012 4:41 AM
To: solr-user@lucene.apache.org
Subject: Re: How can I remove the home page priority of site home page from 
search results


I added braces to the keywords and debugged.
I really need to boost term frequency. Please help.

1.2125369 = (MATCH) fieldWeight(title:gold in 102), product of:
  1.0 = tf(termFreq(title:gold)=1)
  4.8501477 = idf(docFreq=11, maxDocs=564)
  0.25 = fieldNorm(field=title, doc=102)

0.5304849 = (MATCH) fieldWeight(title:gold in 422), product of:
  1.0 = tf(termFreq(title:gold)=1)
  4.8501477 = idf(docFreq=11, maxDocs=564)
  0.109375 = fieldNorm(field=title, doc=422)

0.45470136 = (MATCH) fieldWeight(title:gold in 105), product of:
  2.0 = tf(termFreq(title:gold)=4)
  4.8501477 = idf(docFreq=11, maxDocs=564)
  0.046875 = fieldNorm(field=title, doc=105)



Re: Strip html

2012-06-01 Thread Tigunn
Thanks for your answers. Unfortunately, I can't try before Monday.

First, my Solr settings.
In schema.xml:

In my PHP:
in a loop over all XML documents in my eXist-db database (an XML database
which stores XML files)


An example of an XML doc:


I follow these steps:
1 - I transform the XML to HTML with an XSL sheet (not mine, but I can change
the XSL sheets to generate text without HTML: I want to try that).
For information, XSLT 1.0 returns for the example:

You can notice: the word "castors" is broken by an HTML tag.


2 - I want to strip the HTML tags before indexing.
I try in PHP:  $body_norm = strip_tags($body_norm);
With the current fieldType defined in schema.xml, the result is wrong.
But I want to try 
What do you think?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strip-html-tp3987051p3987253.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Cannot get highlighting to work

2012-06-01 Thread Asfand Qazi
Ah... on further inspection of the schema, I saw that the field type was 
a custom one that had been configured differently from the standard 
'text' one.  I simply got rid of the custom field type and set it back 
to text.  Then as you said I reindexed the data (another blunder on my 
part before).  Now it works!  Thanks


On 01/06/12 13:43, Jack Krupansky wrote:

I got confused in the last paragraph - does a purely alphabetic term get
highlighted properly or not? I am trying to figure out if the problem
relates only to terms that decompose into phrases (as alphanumeric terms
do) or for all terms. Thanks.

If the analyzer changes, the data must be reindexed.

-- Jack Krupansky

-Original Message- From: Asfand Qazi
Sent: Friday, June 01, 2012 5:08 AM
To: solr-user@lucene.apache.org
Subject: Re: Cannot get highlighting to work

On 31/05/12 21:10, Jack Krupansky wrote:

Try a query that uses a term that doesn't split an alphanumeric term
into two terms.

Then check to see what field type you used for the symbol and
marker_symbol fields and whether the analyzer for that field type has
changed in 3.6.



Aha - yes, not using number fields makes the highlighter work. The
analyzer had been changed by another dev (helpfully) for the fields I
was trying to highlight to solr.KeywordTokenizerFactory - I changed it
back to solr.WhitespaceTokenizerFactory, as it was in the 1.4 config.

With a lot of hope I tried to fire the same query, but the exact same
thing happened - the highlighting for a document is an empty document
(i.e. { } ) just like before.

Any other clues?

Thanks







-- Jack Krupansky
-Original Message- From: Asfand Qazi
Sent: Thursday, May 31, 2012 12:32 PM
To: solr-user@lucene.apache.org
Subject: Cannot get highlighting to work

Hello,

I am having problems doing highlighting a Solr 3.6 instance, while it
was working just fine before on our 1.4 instance.

The solrconfig.xml and schema.xml files are located here:

https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/schema.xml



(please note the incorrect line wrapping - it should be on one line)


https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/solrconfig.xml



(please note the incorrect line wrapping - it should be on one line)


The query I fire off (which worked on the 1.4 instance) is:

/solr/main/select?q=Cbx1wt=jsonhl=truehl.fl=*hl.usePhraseHighlighter=true



(please note the incorrect line wrapping - it should be on one line)

I expect a section like:
{
  "MGI:105369": {
    "symbol": [
      "<em>Cbx</em><em>1</em>"
    ],
    "marker_symbol": [
      "<em>Cbx</em><em>1</em>"
    ]
  }
}


I get:
{
  "MGI:105369": { }
}


Can anyone help?

Thanks








--
Regards,
  Asfand Yar Qazi
  Team 87 - High Throughput Gene Targeting
  Wellcome Trust Sanger Institute




Re: how to show DIH query sql in log file

2012-06-01 Thread Rahul Warawdekar
Hi,

Turn the Solr logging level to FINE for the DIH packages/classes and they
will show up in the log.
http://hostname:port/solr/core/admin/logging

On Fri, Jun 1, 2012 at 9:34 AM, wangjing ppm10...@gmail.com wrote:

 how to show DIH query's sql in log file for troubleshooting?

 thanks.




-- 
Thanks and Regards
Rahul A. Warawdekar


eliminate adminPath tag from solr.xml file?

2012-06-01 Thread geeky2
hello all,

referring to:

http://wiki.apache.org/solr/CoreAdmin#Core_Administration

if you wanted to eliminate administration of the core from the web site,
could you either eliminate solr.xml or remove the

<cores adminPath="/admin/cores">

from the solr.xml file?

thank you,


--
View this message in context: 
http://lucene.472066.n3.nabble.com/eliminate-adminPath-tag-from-solr-xml-file-tp3987262.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: possible status codes from solr during a (DIH) data import process

2012-06-01 Thread geeky2
thank you ALL for the great feedback - very much appreciated!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110p3987263.html


Re: Strip html

2012-06-01 Thread Jack Krupansky
The bottom line is that you will need to have your own code that will detect 
the <choice> tag and map it to the desired choice, and you will have to do 
that before you strip html.


So, given:

   <choice>
   <orig>C</orig>
   <reg>c</reg>
   </choice>astors

Your code will have to remove <choice>...</choice> and replace it with the 
element content of the <orig> or <reg> element - but not both.


Otherwise, Strip HTML (either in PHP or Solr) will preserve the white 
space between </reg> and </choice>, which was causing the c to be 
separate from astors.


In short, your PHP code should not use strip_tags, but must replace the 
<choice>...</choice>, but do keep the strip HTML in the Solr schema to 
remove the rest of the HTML.
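
As a rough illustration of the replacement step described above, here is a minimal sketch. It is written in Python rather than PHP, the function names are made up for this example, and it assumes every <choice> contains exactly one <reg> element and that the <reg> form is the one wanted; real documents may need a proper XML parser instead of a regular expression.

```python
import re

# Matches a whole <choice> element and captures the text of its <reg> child.
# Assumption: every <choice> holds exactly one <reg>, as in the sample above.
CHOICE_RE = re.compile(r"<choice>.*?<reg>(.*?)</reg>.*?</choice>", re.DOTALL)


def resolve_choices(markup: str) -> str:
    """Replace each <choice>...</choice> with the content of its <reg> element."""
    return CHOICE_RE.sub(lambda m: m.group(1), markup)


def strip_tags(markup: str) -> str:
    """Very rough tag stripper, standing in for PHP/Solr HTML stripping."""
    return re.sub(r"<[^>]+>", "", markup)


sample = "<choice>\n<orig>C</orig>\n<reg>c</reg>\n</choice>astors"

# Stripping first keeps the line breaks between the tags, so "c" and "astors"
# stay separated; resolving the choice first joins them into one word.
print(repr(strip_tags(sample)))
print(repr(strip_tags(resolve_choices(sample))))
```

The point of doing the replacement first is that the white space inside <choice> disappears together with the element, so nothing is left between "c" and "astors" when the remaining tags are stripped.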


-- Jack Krupansky
-Original Message- 
From: Tigunn

Sent: Friday, June 01, 2012 9:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Strip html

Thanks for your answers. Unfortunately, I can't try this before Monday.

First, my Solr settings:
In schema.xml:

In my PHP:
in a loop over all the XML documents in my Exist-db database (an XML database which
stores XML files)


An example of an XML document:


I follow these steps:
1 - I transform the XML to HTML with an XSL sheet (not mine, but I can change
the XSL sheets to generate text without HTML: I want to try that).
For information, XSLT 1.0 returns this for the example:

You can see that the word castors is broken up by an HTML tag.


2 - I want to strip the HTML tags before indexing.
I tried this in PHP:  $body_norm = strip_tags($body_norm);
With the current fieldType defined in schema.xml, the result is wrong.
But I want to try
What do you think?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strip-html-tp3987051p3987253.html



RE: why DIH works in normal mode,error in debug mode

2012-06-01 Thread Dyer, James
I see this in your stacktrace:  java.sql.SQLException: Illegal value for 
setFetchSize().

It must be that your JDBC driver doesn't like the default value (300) that is 
used.  In your datasource tag, try adding a batchSize attribute of either 0 
or -1 (if using -1, DIH automatically changes it to Integer.MIN_VALUE.  
According to the wiki this is to fix this error.)  The value of batchSize is 
used on the java.sql.Statement objects with setFetchSize(batchSize).

example:
<dataSource ... batchSize="0" />

see:  http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource
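
To make the batchSize handling described above concrete, here is a toy model in Python. It is not DIH source code; the function name is invented, and it only restates the rule from this message (-1 is turned into Integer.MIN_VALUE, any other value is passed to setFetchSize() unchanged).

```python
# java.lang.Integer.MIN_VALUE, written out for this illustration.
INT_MIN = -2**31


def effective_fetch_size(batch_size: int) -> int:
    """Model of the rule described above: -1 becomes Integer.MIN_VALUE."""
    return INT_MIN if batch_size == -1 else batch_size


# 300 is the default value quoted in this message; 0 and -1 are the suggestions.
for configured in (300, 0, -1):
    print(configured, "->", effective_fetch_size(configured))
```

Whether a given value is accepted is then up to the JDBC driver, which is why the driver documentation is the place to check.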

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: wangjing [mailto:ppm10...@gmail.com]
Sent: Friday, June 01, 2012 9:18 AM
To: solr-user@lucene.apache.org
Subject: why DIH works in normal mode,error in debug mode

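Why does DIH work in normal mode? Using
http://localhost:8080/apache-solr-3.6.0/cn/admin/
with the query *:*, it finds everything:

0 35 0 2.2 *:* on 10 Test type 1
A desk lamp is a household appliance used for lighting in everyday life. There are generally two kinds: one stands on a column and one has a clamp. It works mainly by concentrating the light on a small area, which is convenient for work and study. Desk lamps generally use incandescent or energy-saving bulbs.
Some desk lamps also have an emergency function, for emergency lighting during a power cut. Includes bulbs and so on 1 2012-06-01T06:15:58Z Desk lamp French desk lamp A234 3
100.2 100.2,USD Includes bulbs and so on French desk lamp A234 Desk lamp
A desk lamp is a household appliance used for lighting in everyday life. There are generally two kinds: one stands on a column and one has a clamp. It works mainly by concentrating the light on a small area, which is convenient for work and study. Desk lamps generally use incandescent or energy-saving bulbs.
Some desk lamps also have an emergency function, for emergency lighting during a power cut. 10.0


But it fails with an error in debug mode. This has been bothering me for a few days   :_(

The detailed exception stack is: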

2012-6-1 21:51:51 org.apache.solr.common.SolrException log
: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select * from ITEM; Processing Document # 1
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:205)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:999)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:565)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:309)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select * from ITEM; Processing Document # 1
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
... 23 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select * from ITEM; Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at org.apache.solr.handler.dataimport.DebugLogger$2.getData(DebugLogger.java:188)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at 

RE: Data Import Handler fields with different values in column and name

2012-06-01 Thread Dyer, James
Are you leaving both mappings in there, like this...

<entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
  <field column="iddocumento" name="iddocumento" />
  <field column="nrodocumento" name="nrodocumento" />
  <field column="asunto" name="asunto" />
  <field column="asunto" name="anotherasunto" />
</entity>

If so, I'm not sure you can map asunto to two different fields like this.  
For that, you may need to write a transformer that will duplicate asunto for 
you.  Although, in most cases all you need to do is add a <copyField/> in 
schema.xml to copy asunto to anotherasunto.  But a DIH Transformer would be 
helpful, for instance, if asunto is multi-valued but you only want to copy 
the first value to anotherasunto (perhaps you need to sort on it, which is 
not possible with multi-valued fields).

If this doesn't help, let us know exactly why you need to duplicate asunto 
and maybe you can get more help from there.

(If you're not trying to duplicate asunto and you're sure you've taken the 
duplicate out of data-config.xml, then go ahead and double-check spelling and 
case in all your config files.  Besides a typo somewhere, I'm not sure what 
else would cause this not to map.)
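
One way to do the spelling-and-case check mechanically is sketched below in Python. This is only an illustration: the function name is invented, the two XML strings are trimmed-down stand-ins built from the field names in this thread, and the check ignores dynamicField patterns and copyField rules, so a real schema would need more care.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-ins for data-config.xml and schema.xml (assumed content).
DATA_CONFIG = """
<dataConfig>
  <document>
    <entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
      <field column="iddocumento" name="iddocumento" />
      <field column="nrodocumento" name="nrodocumento" />
      <field column="asunto" name="anotherasunto" />
    </entity>
  </document>
</dataConfig>
"""

SCHEMA = """
<schema>
  <fields>
    <field name="iddocumento" type="string" indexed="true" stored="true" required="true" />
    <field name="nrodocumento" type="string" indexed="true" stored="true" />
    <field name="anotherasunto" type="string" indexed="true" stored="true" />
  </fields>
</schema>
"""


def unmapped_targets(data_config_xml: str, schema_xml: str) -> list:
    """Return DIH target names with no exactly matching (case-sensitive) schema field."""
    schema_names = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    targets = [
        f.get("name") or f.get("column")  # fall back to the column when name is absent
        for f in ET.fromstring(data_config_xml).iter("field")
    ]
    return [t for t in targets if t not in schema_names]


print(unmapped_targets(DATA_CONFIG, SCHEMA))
```

An empty list means every target name has a schema field with the same spelling and case; any name that is printed is worth a second look in both files.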

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Rafael Taboada [mailto:kaliman.fore...@gmail.com] 
Sent: Thursday, May 31, 2012 4:13 PM
To: solr-user@lucene.apache.org
Subject: Fwd: Data Import Handler fields with different values in column and 
name

Please,

Can anyone guide me through this issue? Thanks



-- Forwarded message --
From: Rafael Taboada kaliman.fore...@gmail.com
Date: Thu, May 31, 2012 at 12:30 PM
Subject: Data Import Handler fields with different values in column and name
To: solr-user@lucene.apache.org


Hi folks,

I'm using Solr 3.6 and I'm trying to import data from my database to solr
using Data Import Handler. My db-config is like this:

<dataConfig>
   <dataSource driver="oracle.jdbc.OracleDriver"
      url="jdbc:oracle:thin:@localhost:1521:XE" user="admin" password="admin" />
   <document>
      <entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
         <field column="iddocumento" name="iddocumento" />
         <field column="nrodocumento" name="nrodocumento" />
         <field column="asunto" name="asunto" />
      </entity>
   </document>
</dataConfig>

My problem comes when I try to use a name that differs from the column in the
field tag, for example

 <field column="asunto" name="anotherasunto" />

When I use a name different from the column, this field is omitted. Please can
you help me with this issue?

My schema.xml is:

<types>
  <fieldtype name="string" class="solr.StrField" sortMissingLast="true" />
</types>

<fields>
  <!-- general -->
  <field name="iddocumento" type="string" indexed="true" stored="true" required="true" />
  <field name="nrodocumento" type="string" indexed="true" stored="true" />
  <field name="anotherasunto" type="string" indexed="true" stored="true" />
</fields>

Thanks in advance!

-- 
Rafael Taboada






-- 
Rafael Taboada

/*
 * Phone  992 741 026
 */


Re: Strip html

2012-06-01 Thread Tigunn
The XSLT does that.
Jack Krupansky-2 wrote
 
 The bottom line is that you will need to have your own code that will
 detect the <choice> tag and map it to the desired choice, and you will
 have to do that before you strip html.
 
 So, given:
 
 <choice>
 <orig>C</orig>
 <reg>c</reg>
 </choice>astors
 
 Your code will have to remove <choice>...</choice> and replace it with
 the element content of the <orig> or <reg> element - but not both.
 
The XSL sheet removes the <choice>...</choice> and replaces it with the content
of the <reg> element.

 Otherwise, Strip HTML (either in PHP or Solr) will preserve the white
 space between </reg> and </choice>, which was causing the c to be
 separate from astors.

Definitely true, I forgot.

I will try your advice on Monday; I'll be back :)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strip-html-tp3987051p3987275.html


Re: per-fieldtype similarity not working

2012-06-01 Thread Robert Muir
On Fri, Jun 1, 2012 at 5:13 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 Thanks, but I am clearly missing something. We declare the similarity in the 
 fieldType just as in the example, and looking at the example again I don't see 
 how it's being done differently. What am I missing and where do I miss it? :)


Hi Markus, check out the last line at the bottom:
 <!-- default similarity, defers to the fieldType -->
 <similarity class="solr.SchemaSimilarityFactory"/>

When this is set, it means IndexSearcher/IndexWriter use a
PerFieldSimilarityWrapper that delegates based on the Solr schema
fieldType.

Note this is just a simple ordinary similarity impl
(http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/search/similarities/SchemaSimilarityFactory.java),
you could also write your own that works differently.
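
The delegation idea can be pictured with a small conceptual model. The Python below is not Solr or Lucene code; the class, the fieldType names and the similarity names are all made up for illustration, and the fallback to a default is an assumption about what happens when a fieldType declares nothing.

```python
class SchemaSimilarityModel:
    """Toy stand-in for a per-field wrapper: look up by fieldType, else use a default."""

    def __init__(self, per_field_type: dict, default: str):
        self.per_field_type = per_field_type  # fieldType name -> similarity name
        self.default = default

    def get(self, field_type: str) -> str:
        # Delegate to the similarity declared on the fieldType when there is one.
        return self.per_field_type.get(field_type, self.default)


# Hypothetical schema: only the "text_title" fieldType declares its own similarity.
model = SchemaSimilarityModel({"text_title": "CustomTitleSimilarity"}, "DefaultSimilarity")
print(model.get("text_title"))
print(model.get("text_body"))
```

Without the global factory declaration there is nothing doing this lookup, which is why a similarity declared only on the fieldType would have no effect.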

-- 
lucidimagination.com


Re: why DIH works in normal mode,error in debug mode

2012-06-01 Thread wangjing
In my datasource config file:
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
   url="jdbc:mysql://127.0.0.1:3306/MYSOLR?useUnicode=true&amp;characterEncoding=UTF-8"
   user="root" password="qwertyuiop" batchSize="500" />

I have already done that: batchSize is set to 500.



On Fri, Jun 1, 2012 at 10:38 PM, Dyer, James james.d...@ingrambook.com wrote:
 I see this in your stacktrace:  java.sql.SQLException: Illegal value for 
 setFetchSize().

 It must be that your JDBC driver doesn't like the default value (300) that is 
 used.  In your datasource tag, try adding a batchSize attribute of either 0 
 or -1 (if using -1, DIH automatically changes it to Integer.MIN_VALUE.  
 According to the wiki this is to fix this error.)  The value of batchSize 
 is used on the java.sql.Statement objects with setFetchSize(batchSize).

 example:
 <dataSource ... batchSize="0" />

 see:  http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource

 James Dyer
 E-Commerce Systems
 Ingram Content Group
 (615) 213-4311



[job] Looking for local SOLR developers

2012-06-01 Thread Chambeda
Hi all,

My team is looking for developers with SOLR experience for a company in
Minneapolis, MN.  If interested please reply to this posting and I can fill
you in on more details.

Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/job-Looking-for-local-SOLR-developers-tp3987278.html


RE: why DIH works in normal mode,error in debug mode

2012-06-01 Thread Dyer, James
Try setting it to 0 or -1.  Or check the MySQL JDBC driver documentation about 
valid values for Statement.setFetchSize().  I think someone else recently 
asked on this same list about problems with the latest MySQL driver and fetch 
sizes, so this driver may be particularly finicky.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: wangjing [mailto:ppm10...@gmail.com]
Sent: Friday, June 01, 2012 10:00 AM
To: solr-user@lucene.apache.org
Subject: Re: why DIH works in normal mode,error in debug mode

In my datasource config file:
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
   url="jdbc:mysql://127.0.0.1:3306/MYSOLR?useUnicode=true&amp;characterEncoding=UTF-8"
   user="root" password="qwertyuiop" batchSize="500" />

I have already done that: batchSize is set to 500.




Sorting with customized function of score

2012-06-01 Thread Toan V Luu
Hi,
When I use sort=score asc it works, but when I use a custom
function such as sort=sum(score,2) asc I get the error "can not sort on
multivalued field: sum(score,2)". Do you know why, and how to solve it?
Thanks
Toan.


Search request on Solr Cloud

2012-06-01 Thread Trym R. Møller

Hi

I would like to execute the following query on Solr trunk (cloud):
http://localhost:8983/solr/select?collection=myCollection&q=*%3A*&start=0&rows=10&wt=xml
http://localhost:8983/solr/x/select?collection=edr_sms_2011_05&q=*%3A*&start=0&rows=10&wt=xml


but it fails with a http 404 error.

1. Looking into SolrDispatchFilter#doFilter it seems like the query 
needs either a shard name, a collection name between solr and 
/select elements in the path or a defaultCoreName. Is this correct?


2. Looking into solr.xml it seems like I can specify the defaultCoreName 
in the cores-tag. Is this correct?


3. I create my cores dynamically and information about them is stored 
in zookeeper. Is it possible to store the defaultCoreName in zookeeper 
as well, and where should I look for information about how to do this?


Thanks for any comments on this.

Best regards Trym


Re: Data Import Handler fields with different values in column and name

2012-06-01 Thread Jack Krupansky
James: Is there some particular DIH logging he can turn on to see what is 
really happening with his field name mapping? In other words, if DIH/Solr 
really is ignoring that field mapping, to find out exactly why.


-- Jack Krupansky

-Original Message- 
From: Dyer, James

Sent: Friday, June 01, 2012 10:50 AM
To: solr-user@lucene.apache.org
Subject: RE: Data Import Handler fields with different values in column and 
name


Are you leaving both mappings in there, like this...

<entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
  <field column="iddocumento" name="iddocumento" />
  <field column="nrodocumento" name="nrodocumento" />
  <field column="asunto" name="asunto" />
  <field column="asunto" name="anotherasunto" />
</entity>

If so, I'm not sure you can map asunto to two different fields like this. 
For that, you may need to write a transformer that will duplicate asunto 
for you.  Although, in most cases all you need to do is add a <copyField/> 
in schema.xml to copy asunto to anotherasunto.  But a DIH Transformer 
would be helpful, for instance, if asunto is multi-valued but you only 
want to copy the first value to anotherasunto (perhaps you need to sort on 
it, which is not possible with multi-valued fields).


If this doesn't help, let us know exactly why you need to duplicate asunto 
and maybe you can get more help from there.


(If you're not trying to duplicate asunto and you're sure you've taken the 
duplicate out of data-config.xml, then go ahead and double-check spelling 
and case in all your config files.  Besides a typo somewhere, I'm not sure 
what else would cause this not to map.)


James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311





RE: per-fieldtype similarity not working

2012-06-01 Thread Markus Jelsma
Hi!


Ah, it makes sense now! This globally configured similarity returns the 
similarity defined on the fieldType if one is available and, if not, the standard 
Lucene similarity. This would, I assume, mean that the two similarities below, 
with no similarity declared per fieldType, would always yield the same results?

<similarity class="org.apache.lucene.search.similarities.DefaultSimilarity"/>
<similarity class="solr.SchemaSimilarityFactory"/>

I would assume so because, with none declared per fieldType, the 
SchemaSimilarityFactory returns the default Lucene Similarity. However, when 
I check, it doesn't work for my url field but does work for the content and 
title fields. I have defined the same similarity for the url fieldType as I did 
for the title fieldType. This is the output for solr.SchemaSimilarityFactory 
without a per-field similarity declared: 

  38.565483 = (MATCH) max plus 0.27 times others of:
5.434552 = (MATCH) weight(content:groning^1.4 in 384) [], result of:
  5.434552 = score(doc=384,freq=10.0 = termFreq=10.0
), product of:
1.5511217 = queryWeight, product of:
  1.4 = boost
  1.1079441 = idf(docFreq=1236, maxDocs=1378)
  1.0 = queryNorm
3.503627 = fieldWeight in 384, product of:
  3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
  1.1079441 = idf(docFreq=1236, maxDocs=1378)
  1.0 = fieldNorm(doc=384)
4.38 = (MATCH) weight(title:groning^4.7 in 384) [], result of:
  4.38 = score(doc=384,freq=2.0 = termFreq=2.0
), product of:
5.346149 = queryWeight, product of:
  4.7 = boost
  1.1374786 = idf(docFreq=1200, maxDocs=1378)
  1.0 = queryNorm
0.8043188 = fieldWeight in 384, product of:
  1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
  1.1374786 = idf(docFreq=1200, maxDocs=1378)
  0.5 = fieldNorm(doc=384)
35.937153 = (MATCH) weight(url:groning^2.1 in 384) [], result of:
  35.937153 = score(doc=384,freq=1.0 = termFreq=1.0
), product of:
10.988577 = queryWeight, product of:
  2.1 = boost
  5.232656 = idf(docFreq=19, maxDocs=1378)
  1.0 = queryNorm
3.27041 = fieldWeight in 384, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1.0
  5.232656 = idf(docFreq=19, maxDocs=1378)
  0.625 = fieldNorm(doc=384)


Here's the output with DefaultSimilarity declared:

  3.2723136 = (MATCH) max plus 0.27 times others of:
0.46112633 = (MATCH) weight(content:groning^1.4 in 327) 
[DefaultSimilarity], result of:
  0.46112633 = score(doc=327,freq=10.0 = termFreq=10.0
), product of:
0.13161398 = queryWeight, product of:
  1.4 = boost
  1.1079441 = idf(docFreq=1236, maxDocs=1378)
  0.08485084 = queryNorm
3.503627 = fieldWeight in 327, product of:
  3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
  1.1079441 = idf(docFreq=1236, maxDocs=1378)
  1.0 = fieldNorm(doc=327)
0.36485928 = (MATCH) weight(title:groning^4.7 in 327) [DefaultSimilarity], 
result of:
  0.36485928 = score(doc=327,freq=2.0 = termFreq=2.0
), product of:
0.45362523 = queryWeight, product of:
  4.7 = boost
  1.1374786 = idf(docFreq=1200, maxDocs=1378)
  0.08485084 = queryNorm
0.8043188 = fieldWeight in 327, product of:
  1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
  1.1374786 = idf(docFreq=1200, maxDocs=1378)
  0.5 = fieldNorm(doc=327)
3.0492976 = (MATCH) weight(url:groning^2.1 in 327) [DefaultSimilarity], 
result of:
  3.0492976 = score(doc=327,freq=1.0 = termFreq=1.0
), product of:
0.93239 = queryWeight, product of:
  2.1 = boost
  5.232656 = idf(docFreq=19, maxDocs=1378)
  0.08485084 = queryNorm
3.27041 = fieldWeight in 327, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1.0
  5.232656 = idf(docFreq=19, maxDocs=1378)
  0.625 = fieldNorm(doc=327)

How can I explain the difference? Also, with the factory declared, the score of 
the url field is still the same; it does not seem to honor the per-field 
declared similarity. The debug output also seems wrong: it does not write the 
similarity class name between [] and produces an empty [] for each match.

Many thanks and a nice weekend!
Markus
 
 
-Original message-
 From:Robert Muir rcm...@gmail.com
 Sent: Fri 01-Jun-2012 17:00
 To: solr-user@lucene.apache.org
 Subject: Re: per-fieldtype similarity not working
 
 On Fri, Jun 1, 2012 at 5:13 AM, Markus Jelsma
 markus.jel...@openindex.io wrote:
  Thanks but i am clearly missing something? We declare the similarity in the 
  fieldType just as in the 

Re: Data Import Handler fields with different values in column and name

2012-06-01 Thread Rafael Taboada
Hi James,

I'm not duplicating fields. Just using one field asunto:

 <field column="asunto" name="anotherasunto" />

Thanks for your help.

On Fri, Jun 1, 2012 at 9:50 AM, Dyer, James james.d...@ingrambook.com wrote:

 Are you leaving both mappings in there, like this...

 <entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
  <field column="iddocumento" name="iddocumento" />
  <field column="nrodocumento" name="nrodocumento" />
  <field column="asunto" name="asunto" />
  <field column="asunto" name="anotherasunto" />
 </entity>

 If so, I'm not sure you can map asunto to two different fields like
 this.  For that, you may need to write a transformer that will duplicate
 asunto for you.  Although, in most cases all you need to do is add a
 <copyField /> in schema.xml to copy asunto to anotherasunto.  But a DIH
 Transformer would be helpful, for instance, if asunto is multi-valued but
 you only want to copy the first value to anotherasunto (perhaps you need
 to sort on it, which is not possible with multi-valued fields).

 If this doesn't help, let us know exactly why you need to duplicate
 asunto and maybe you can get more help from there.

 (If you're not trying to duplicate asunto and you're sure you've taken
 the duplicate out of data-config.xml, then go ahead and double-check
 spelling and case in all your config files.  Besides a typo somewhere, I'm
 not sure what else would cause this not to map.)

 James Dyer
 E-Commerce Systems
 Ingram Content Group
 (615) 213-4311


 -Original Message-
 From: Rafael Taboada [mailto:kaliman.fore...@gmail.com]
 Sent: Thursday, May 31, 2012 4:13 PM
 To: solr-user@lucene.apache.org
 Subject: Fwd: Data Import Handler fields with different values in column
 and name

 Please,

 Can anyone guide me through this issue? Thanks



 -- Forwarded message --
 From: Rafael Taboada kaliman.fore...@gmail.com
 Date: Thu, May 31, 2012 at 12:30 PM
 Subject: Data Import Handler fields with different values in column and
 name
 To: solr-user@lucene.apache.org


 Hi folks,

 I'm using Solr 3.6 and I'm trying to import data from my database to solr
 using Data Import Handler. My db-config is like this:

 <dataConfig>
   <dataSource driver="oracle.jdbc.OracleDriver"
       url="jdbc:oracle:thin:@localhost:1521:XE" user="admin" password="admin" />
   <document>
     <entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
       <field column="iddocumento" name="iddocumento" />
       <field column="nrodocumento" name="nrodocumento" />
       <field column="asunto" name="asunto" />
     </entity>
   </document>
 </dataConfig>

 My problem arises when I use different values for column and name in the
 field tag, for example:

 <field column="asunto" name="anotherasunto" />

 When the name differs from the column, the field is omitted. Can you
 please help me with this issue?

 My schema.xml is:

 <types>
   <fieldtype name="string" class="solr.StrField" sortMissingLast="true" />
 </types>

 <fields>
   <!-- general -->
   <field name="iddocumento" type="string" indexed="true" stored="true" required="true" />
   <field name="nrodocumento" type="string" indexed="true" stored="true" />
   <field name="anotherasunto" type="string" indexed="true" stored="true" />
 </fields>

 Thanks in advance!

 --
 Rafael Taboada






 --
 Rafael Taboada

 /*
  * Phone  992 741 026
  */




-- 
Rafael Taboada

/*
 * Phone  992 741 026
 */


Re: Data Import Handler fields with different values in column and name

2012-06-01 Thread Rafael Taboada
Hi Jack.

The log just shows that the import is successful.

Jun 1, 2012 8:50:38 AM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Jun 1, 2012 8:50:38 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start
commit(optimize=false,waitFlush=false,waitSearcher=true,expungeDeletes=false)
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
commit{dir=/home/rafael/solr/data/index,segFN=segments_1,version=1338565818575,generation=1,filenames=[segments_1]
commit{dir=/home/rafael/solr/data/index,segFN=segments_2,version=1338565818584,generation=2,filenames=[_0.tis,
_3.frq, _3.tii, _1.frq, _3.fnm, _3.fdt, _2.tii, _1.fnm, _1.tii, _0.prx,
_3.nrm, _0.nrm, _1.tis, _0.fnm, _2.prx, _2.fdt, _2.frq, _3.prx, _2.fdx,
_2.fnm, _3.fdx, _1.prx, _1.fdx, _2.tis, _0.tii, _1.fdt, _0.frq, segments_2,
_0.fdx, _0.fdt, _1.nrm, _2.nrm, _3.tis]
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1338565818584
Jun 1, 2012 8:50:38 AM org.apache.solr.search.SolrIndexSearcher init
INFO: Opening Searcher@c16c2c0 main
Jun 1, 2012 8:50:38 AM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@c16c2c0 main from Searcher@19fac852 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 8:50:38 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Jun 1, 2012 8:50:38 AM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@c16c2c0 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 8:50:38 AM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@c16c2c0 main
Jun 1, 2012 8:50:38 AM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@19fac852 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jun 1, 2012 8:50:38 AM
org.apache.solr.handler.dataimport.SimplePropertiesWriter
readIndexerProperties
INFO: Read dataimport.properties
Jun 1, 2012 8:50:38 AM
org.apache.solr.handler.dataimport.SimplePropertiesWriter persist
INFO: Wrote last indexed time to dataimport.properties
Jun 1, 2012 8:50:38 AM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:14.677
Jun 1, 2012 8:50:38 AM org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: {deleteByQuery=*:*,add=[94, 96, 177, 178, 179, 1082, 181, 1089, ...
(425311 adds)],commit=} 0 30
Jun 1, 2012 8:50:39 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:39 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:39 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:40 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={} status=0 QTime=0
Jun 1, 2012 8:50:48 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select/
params={indent=onstart=0q=*:*version=2.2rows=10} hits=425311 status=0
QTime=26

But I tried with a MySQL database and the import is OK:

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://localhost:3306/solr" user="solr" password="solr" />
  <document>
    <entity name="usuario" query="select idusuario,nombres,apellidos from usuario">
      <field column="idusuario" name="idusuario" />
      <field column="nombres" name="nombres1" />
      <field column="apellidos" name="apellidos1" />
    </entity>
  </document>
</dataConfig>

<fields>
  <field name="idusuario" type="sint" indexed="true" stored="true" required="true" />
  <field name="nombres1" type="string" indexed="true" stored="true" />
  <field name="apellidos1" type="string" indexed="true" stored="true" />
</fields>

Is there any issue with Oracle? I don't think that is the problem, but I
will export my data to MySQL to rule it out.

Thanks for your help


On Fri, Jun 1, 2012 at 10:34 AM, Jack Krupansky 

Re: per-fieldtype similarity not working

2012-06-01 Thread Robert Muir
On Fri, Jun 1, 2012 at 11:39 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 Hi!


 Ah, it makes sense now! This globally configured similarity now returns a 
 fieldType-defined similarity if available and, if not, the standard Lucene 
 similarity. This would, I assume, mean that the two defined similarities 
 below, without per-fieldType declared similarities, would always yield the same 
 results?

Not true: note that two methods (coord and queryNorm) are not per-field
but global across the entire query tree.

By default both are disabled in the wrapper, as they only skew (coord) or
confuse (queryNorm) most modern scoring algorithms (e.g. all the new ranking
algorithms in Lucene 4).

So if you want to do per-field scoring where *all* of your sims are
vector-space, it could make sense to customize (e.g. subclass)
SchemaSimilarityFactory and do something useful for these methods.
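
To make the queryNorm point concrete, here is a small Python sketch (not part of the original exchange) that recomputes the content:groning score from the two explain outputs quoted earlier in this thread. The function name term_score is made up for illustration, and it is an assumption that the first output, the one with the empty [], is the run with the factory declared. The only input that differs between the two runs is queryNorm (1.0 versus 0.08485084).

```python
import math


def term_score(boost, idf, query_norm, tf, field_norm):
    """Recompute one term's score the way the explain output breaks it down."""
    query_weight = boost * idf * query_norm   # "queryWeight, product of"
    field_weight = tf * idf * field_norm      # "fieldWeight ..., product of"
    return query_weight * field_weight


# Values copied from the explain output for content:groning^1.4 (freq=10).
common = dict(boost=1.4, idf=1.1079441, tf=math.sqrt(10.0), field_norm=1.0)

score_norm_off = term_score(query_norm=1.0, **common)         # queryNorm = 1.0
score_norm_on = term_score(query_norm=0.08485084, **common)   # DefaultSimilarity run

print(score_norm_off, score_norm_on)
print(score_norm_on / score_norm_off)  # the ratio between the two runs
```

The two results should line up with the 5.434552 and 0.46112633 in the explain output, and their ratio is the queryNorm. Since queryNorm multiplies every clause of a given query by the same constant, it changes the absolute numbers but, taken on its own, should not change the ordering of results.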


-- 
lucidimagination.com


Re: Data Import Handler fields with different values in column and name

2012-06-01 Thread Rafael Taboada
Hi!

I think it works, but only when I use a column alias with the Oracle database:

<entity name="documento" query="SELECT iddocumento,nrodocumento,asunto as asunto1,autor FROM documento">
  <field column="iddocumento" name="iddocumento" />
  <field column="nrodocumento" name="nrodocumento" />
  <field column="asunto1" name="asunto1" />
  <field column="autor" name="autor" />
</entity>

Am I doing something wrong here? Has anyone tried DIH with Oracle?

Thanks for your help
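
One hypothesis that may be worth testing (nobody in this thread confirms it): Oracle folds unquoted identifiers to upper case, so the JDBC row may carry the key ASUNTO rather than asunto. The Python sketch below is a toy model, not DIH's actual code. It assumes that explicit column-to-name mappings are matched on the exact column key, while columns without an explicit mapping are matched to schema fields ignoring case. The function build_doc and that matching rule are assumptions made for illustration only.

```python
def build_doc(row, explicit_fields, schema_fields):
    """Toy model (NOT DIH's code) of turning one JDBC row into a Solr document.

    Assumption: an explicit mapping is found only if its column matches the
    row key exactly; unmapped columns fall back to a case-insensitive match
    against the schema field names.
    """
    schema_by_lower = {name.lower(): name for name in schema_fields}
    doc = {}
    for column, value in row.items():
        if column in explicit_fields:               # exact, case-sensitive
            doc[explicit_fields[column]] = value
        elif column.lower() in schema_by_lower:     # implicit, case-insensitive
            doc[schema_by_lower[column.lower()]] = value
    return doc


# Row keys as an Oracle driver would plausibly report them (upper case).
row = {"IDDOCUMENTO": "94", "ASUNTO": "some subject"}
schema = ["iddocumento", "nrodocumento", "anotherasunto"]

lower_case_mapping = build_doc(row, {"asunto": "anotherasunto"}, schema)
upper_case_mapping = build_doc(row, {"ASUNTO": "anotherasunto"}, schema)

print(lower_case_mapping)   # anotherasunto is missing under this model
print(upper_case_mapping)   # anotherasunto is filled under this model
```

If this model is close to what happens, writing the column in upper case in data-config.xml (column=ASUNTO) would be a quick experiment. It would also be consistent with MySQL behaving differently and with same-name mappings working.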



On Fri, Jun 1, 2012 at 10:34 AM, Jack Krupansky j...@basetechnology.com wrote:

 James: Is there some particular DIH logging he can turn on to see what is
 really happening with his field name mapping? In other words, if DIH/Solr
 really is ignoring that field mapping, to find out exactly why.

 -- Jack Krupansky

 -Original Message- From: Dyer, James
 Sent: Friday, June 01, 2012 10:50 AM
 To: solr-user@lucene.apache.org
 Subject: RE: Data Import Handler fields with different values in column
 and name


 Are you leaving both mappings in there, like this...

 entity name=documento query=SELECT iddocumento,nrodocumento,asunto
 FROM documento
 field column=iddocumento name=iddocumento /
 field column=nrodocumento name=nrodocumento /
 field column=asunto name=asunto /
 field column=asunto name=anotherasunto /
 /entity

 If so, I'm not sure you can map asunto to two different fields like
 this. For that, you may need to write a transformer that will duplicate
 asunto for you.  Although, in most cases all you need to do is add a
 copyField / in schema.xml to copy asunto to anotherasunto.  But a DIH
 Transformer would be helpful, for instance, if asunto is multi-valued but
 you only want to copy the first value to anotherasunto (perhaps you need
 to sort on it, which is not possible with multi-valued fields).

 If this doesn't help, let us know exactly why you need to duplicate
 asunto and maybe you can get more help from there.

 (If you're not trying to duplicate asunto and you're sure you've taken
 the duplicate out of data-config.xml, then go ahead and double-check
 spelling and case in all your config files.  Besides a typo somewhere, I'm
 not sure what else would cause this not to map.)

 James Dyer
 E-Commerce Systems
 Ingram Content Group
 (615) 213-4311


 -Original Message-
 From: Rafael Taboada [mailto:kaliman.fore...@gmail.com]
 Sent: Thursday, May 31, 2012 4:13 PM
 To: solr-user@lucene.apache.org
 Subject: Fwd: Data Import Handler fields with different values in column
 and name

 Please,

 Can anyone guide me through this issue? Thanks



 -- Forwarded message --
 From: Rafael Taboada kaliman.fore...@gmail.com
 Date: Thu, May 31, 2012 at 12:30 PM
 Subject: Data Import Handler fields with different values in column and
 name
 To: solr-user@lucene.apache.org


 Hi folks,

 I'm using Solr 3.6 and I'm trying to import data from my database to solr
 using Data Import Handler. My db-config is like this:

 dataConfig
  dataSource driver=oracle.jdbc.OracleDriver
 url=jdbc:oracle:thin:@localhost:1521:XE user=admin password=admin /
  document
 entity name=documento query=SELECT
 iddocumento,nrodocumento,asunto FROM documento
field column=iddocumento name=iddocumento /
field column=nrodocumento name=nrodocumento /
field column=asunto name=asunto /
 /entity
  /document
 /dataConfig

 My problem is when I'm trying to use a different values in the field tag,
 for example

field column=asunto name=anotherasunto /

 When I use different name from column, this field is omitted. Please can
 you help me with this issue?

 My schema.xml is:

 types
 fieldtype name=string class=solr.StrField sortMissingLast=true
 /
  /types

  fields
 !-- general --
 field name=iddocumento type=string indexed=true stored=true
 required=true /
 field name=nrodocumento type=string indexed=true stored=true
 /
 field name=anotherasunto type=string indexed=true
 stored=true /
  /fields

 Thanks in advance!

 --
 Rafael Taboada






 --
 Rafael Taboada

 /*
 * Phone  992 741 026
 */




-- 
Rafael Taboada

/*
 * Phone  992 741 026
 */


Re: EventListeners of DIH

2012-06-01 Thread khuram120
I am looking to do the same: I also want to update a table after every
document is updated, added, or deleted in the Solr index. onImportStart and
onImportEnd only fire at the beginning and end of an import, but I need an
event that fires after each document is indexed, not one that fires only at
the end.

If there is no such event, there should be one. That's my suggestion.

What did you do to achieve the table update after each document? Please let
me know; it would be highly appreciated.

Thanks in advance.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/EventListeners-of-DIH-tp497539p3987270.html
Sent from the Solr - User mailing list archive at Nabble.com.


solr, how can I make search query with fixed slop(distance)

2012-06-01 Thread Jihyun Suh
I want to search for data within a fixed slop in Solr.

For example, I use the query 'title:+solr +user ~2' to find documents
that have 'solr' and 'user' within a slop of 2, but it does not work in
Solr. I also tried some parameters (defType=edismax, pf, qs, ps); they do
not change the search results, only their order.

If I use a phrase query like 'title:solr user~2', it does not return
results such as '... users for solr ...', where the keywords are not in
the same order.

What can I do? Please help.
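
For reference, a minimal Python sketch of building a sloppy phrase query and URL-encoding it for a select request. The Lucene PhraseQuery documentation notes that swapping two adjacent words already costs two position moves, so a text like "users for solr", with an extra word in between, would presumably need a slop larger than 2. The value 4 below is only an illustrative guess, the helper name sloppy_phrase is made up, and it is an assumption that the field's analyzer reduces "users" to "user".

```python
from urllib.parse import parse_qs, urlencode


def sloppy_phrase(field, words, slop):
    """Build a Lucene-syntax sloppy phrase query, e.g. title:"solr user"~2."""
    return '{}:"{}"~{}'.format(field, " ".join(words), slop)


# The slop value 4 is a guess for illustration, not a recommendation.
query = sloppy_phrase("title", ["solr", "user"], 4)
params = urlencode({"q": query, "wt": "json"})

print(query)
print(params)   # query string to append after /select?
```

The double quotes matter: without them the ~N suffix is not read as phrase slop. Raising the slop loosens the match in both directions, so it is worth checking that the extra matches are acceptable.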


Re: I got ERROR, Unable to execute query

2012-06-01 Thread Jack Krupansky
Is test_5 created by a stored procedure? If so, is there a possibility that 
the stored procedure may have done an update and not returned data - but 
just sometimes?


-- Jack Krupansky

-Original Message- 
From: Jihyun Suh

Sent: Friday, June 01, 2012 12:02 PM
To: solr-user-h...@lucene.apache.org ; solr-user@lucene.apache.org
Subject: I got ERROR, Unable to execute query

I use many tables for indexing.

During dataimport, I get errors such as "Unable to execute query" for some
tables. But the next time I run dataimport for the same table, it completes
without any error.

[Thread-17] ERROR o.a.s.h.d.EntityProcessorWrapper - Exception in entity :
test_5:org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to execute query:
SELECT Title, url, synonym, description FROM test_5 WHERE status in
('1','s') Processing Document # 11046

at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:253)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)




Re: eliminate adminPath tag from solr.xml file?

2012-06-01 Thread Chris Hostetter

: http://wiki.apache.org/solr/CoreAdmin#Core_Administration
: 
: if you wanted to eliminate administration of the core from the web site,
: 
: could you eliminate either solr.xml or remove the 
: 
: cores adminPath=/admin/cores from the solr.xml file?

As mentioned on that page...

adminPath - Relative path to access the CoreAdminHandler for dynamic core 
manipulation. For example, adminPath=/admin/cores configures access via 
http://localhost:8983/solr/admin/cores. If this attribute is not 
specified, dynamic manipulation is unavailable. 



-Hoss


RE: Data Import Handler fields with different values in column and name

2012-06-01 Thread Dyer, James
I do not see any logging statements in the code, so I don't think there's 
anything on that end that can be done.

It would be easy, though, if he is using multiple mappings, to remove the 
duplicate and see if that solves it.  From a more thorough review of the code, 
though, I think my initial hunch was wrong.  It does seem as if you can have 
multiple mappings.  Then again, I didn't check for a unit test on this, so even 
if the code is designed to allow it, it might not work (it doesn't seem like a 
feature people would expect to work, either). In troubleshooting this one, I'd 
definitely try it without multiple mappings to see if that fixes it.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Friday, June 01, 2012 10:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Data Import Handler fields with different values in column and name

James: Is there some particular DIH logging he can turn on to see what is 
really happening with his field name mapping? In other words, if DIH/Solr 
really is ignoring that field mapping, to find out exactly why.

-- Jack Krupansky

-Original Message- 
From: Dyer, James
Sent: Friday, June 01, 2012 10:50 AM
To: solr-user@lucene.apache.org
Subject: RE: Data Import Handler fields with different values in column and 
name

Are you leaving both mappings in there, like this...

entity name=documento query=SELECT iddocumento,nrodocumento,asunto FROM 
documento
field column=iddocumento name=iddocumento /
field column=nrodocumento name=nrodocumento /
field column=asunto name=asunto /
field column=asunto name=anotherasunto /
/entity

If so, I'm not sure you can map asunto to two different fields like this. 
For that, you may need to write a transformer that will duplicate asunto 
for you.  Although, in most cases all you need to do is add a copyField / 
in schema.xml to copy asunto to anotherasunto.  But a DIH Transformer 
would be helpful, for instance, if asunto is multi-valued but you only 
want to copy the first value to anotherasunto (perhaps you need to sort on 
it, which is not possible with multi-valued fields).

If this doesn't help, let us know exactly why you need to duplicate asunto 
and maybe you can get more help from there.

(If you're not trying to duplicate asunto and you're sure you've taken the 
duplicate out of data-config.xml, then go ahead and double-check spelling 
and case in all your config files.  Besides a typo somewhere, I'm not sure 
what else would cause this not to map.)

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Rafael Taboada [mailto:kaliman.fore...@gmail.com]
Sent: Thursday, May 31, 2012 4:13 PM
To: solr-user@lucene.apache.org
Subject: Fwd: Data Import Handler fields with different values in column and 
name

Please,

Can anyone guide me through this issue? Thanks



-- Forwarded message --
From: Rafael Taboada kaliman.fore...@gmail.com
Date: Thu, May 31, 2012 at 12:30 PM
Subject: Data Import Handler fields with different values in column and name
To: solr-user@lucene.apache.org


Hi folks,

I'm using Solr 3.6 and I'm trying to import data from my database to solr
using Data Import Handler. My db-config is like this:

dataConfig
   dataSource driver=oracle.jdbc.OracleDriver
url=jdbc:oracle:thin:@localhost:1521:XE user=admin password=admin /
   document
  entity name=documento query=SELECT
iddocumento,nrodocumento,asunto FROM documento
 field column=iddocumento name=iddocumento /
 field column=nrodocumento name=nrodocumento /
 field column=asunto name=asunto /
  /entity
   /document
/dataConfig

My problem is when I'm trying to use a different values in the field tag,
for example

 field column=asunto name=anotherasunto /

When I use different name from column, this field is omitted. Please can
you help me with this issue?

My schema.xml is:

types
  fieldtype name=string class=solr.StrField sortMissingLast=true
/
   /types

   fields
  !-- general --
  field name=iddocumento type=string indexed=true stored=true
required=true /
  field name=nrodocumento type=string indexed=true stored=true
/
  field name=anotherasunto type=string indexed=true
stored=true /
   /fields

Thanks in advance!

-- 
Rafael Taboada






-- 
Rafael Taboada

/*
* Phone  992 741 026
*/ 



Re: possible status codes from solr during a (DIH) data import process

2012-06-01 Thread Savvas Andreas Moysidis
Hello,

Driven by the same requirements we also implemented the same polling
mechanism (in java) and found it a bit awkward and error prone having
to search through the returned response for occurrences of the terms
failure or Rollback etc.
It would be *really* handy if the status command returned numeric
values to reflect the current state of the DIH process (similar to the
HTTP status codes a server sends to a web browser).
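
Until something like that exists, the string matching can at least be kept in one place. Below is a minimal Python sketch of that idea. It assumes the status response has already been parsed into a dict with a status entry and a statusMessages map; those key names, the sample messages, and the numeric codes are all assumptions for illustration, so check them against what your Solr version returns.

```python
def dih_state_code(response):
    """Map a parsed DIH status response (a dict) to a small numeric code.

    0 = idle and no failure marker found, 1 = import still running,
    2 = failure or rollback marker found.
    The keys 'status' and 'statusMessages' are assumed, not guaranteed.
    """
    if response.get("status") == "busy":
        return 1
    messages = response.get("statusMessages", {})
    text = " ".join(str(value) for value in messages.values()).lower()
    if any(marker in text for marker in ("fail", "rollback", "rolled back")):
        return 2
    return 0


# Hypothetical sample responses, hand-written for this sketch.
running = {"status": "busy", "statusMessages": {"Total Rows Fetched": "1200"}}
failed = {"status": "idle", "statusMessages": {"": "Indexing failed. Rolled back all changes."}}
done = {"status": "idle", "statusMessages": {"": "Indexing completed."}}

print(dih_state_code(running), dih_state_code(failed), dih_state_code(done))
```

The polling loop can then switch on the code instead of searching the response text in several places.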

Our 2 cents.. :)

On 1 June 2012 15:29, geeky2 gee...@hotmail.com wrote:
 thank you ALL for the great feedback - very much appreciated!



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110p3987263.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to find the age of a page

2012-06-01 Thread Jack Krupansky
If you uncomment the timestamp field in the Solr example, Solr will 
automatically initialize it for each new document to be the time when the 
document is indexed (or most recently indexed). Any field declared with 
default=NOW and not explicitly initialized will have the current time when 
indexed (or re-indexed.)
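
Once such a timestamp comes back in the results, the age is a simple subtraction on the client side. A minimal Python sketch, assuming the value is returned in Solr's UTC date format without fractional seconds (for example 2012-06-01T10:55:00Z); the function name age_of is made up for illustration.

```python
from datetime import datetime, timezone


def age_of(timestamp, now=None):
    """Return the time elapsed since a Solr UTC date string, as a timedelta.

    Assumes the form YYYY-MM-DDThh:mm:ssZ with no fractional seconds.
    """
    indexed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    indexed = indexed.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - indexed


# A fixed "now" keeps the example reproducible.
reference = datetime(2012, 6, 2, 6, 55, 0, tzinfo=timezone.utc)
age = age_of("2012-06-01T06:55:00Z", now=reference)
print(age.days, age.total_seconds())
```

Note that this measures time since the document was indexed (or re-indexed), which is not necessarily when the page itself last changed.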


-- Jack Krupansky

-Original Message- 
From: in.abdul

Sent: Friday, June 01, 2012 6:55 AM
To: solr-user@lucene.apache.org
Subject: Re: How to find the age of a page

Shameema Umer,

You can add a new field to the schema and, while updating or indexing,
write the timestamp to that field.

   Thanks and Regards,
   S SYED ABDUL KATHER



On Fri, Jun 1, 2012 at 3:44 PM, Shameema Umer [via Lucene] 
ml-node+s472066n3987234...@n3.nabble.com wrote:


Hi all,

How can I find the age of a page in the Solr results, that is, its
last-updated time?
tstamp refers to the fetch time, not the exact updated time, right?






-
THANKS AND REGARDS,
SYED ABDUL KATHER
--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-find-the-age-of-a-page-tp3987234p3987238.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Data Import Handler fields with different values in column and name

2012-06-01 Thread Jack Krupansky

I'm still looking, but I do see this config for a unit test:

public static final String dc_singleEntity = "<dataConfig>\n"
    + "<dataSource  type=\"MockDataSource\"/>\n"
    + "<document name=\"X\" >\n"
    + "<entity name=\"x\" query=\"select * from x\">\n"
    + "  <field column=\"id\"/>\n"
    + "  <field column=\"desc\"/>\n"
    + "  <field column=\"desc\" name=\"desc_s\" />"
    + "</entity>\n"
    + "</document>\n"
    + "</dataConfig>";

This suggests that you can have multiple output fields for a single input 
column. But I need to read the test (TestDocBuilder.singleEntityOneRow) more 
closely; I'm not sure it checks all output fields.


-- Jack Krupansky
-Original Message- 
From: Dyer, James

Sent: Friday, June 01, 2012 1:30 PM
To: solr-user@lucene.apache.org
Subject: RE: Data Import Handler fields with different values in column and 
name


I do not see any logging statements in the code, so I don't think there's 
anything on that end that can be done.


It would be easy, though, if he is using multiple mappings to remove the 
duplicate and see if that solves it.  From a more-thorough review of the 
code, though, I think my initial hunch was wrong.  It does seem as if you can 
have multiple mappings.  Then again, I didn't check for a unit test on this 
so even if the code is designed to allow it, it might not work (it doesn't 
seem like a feature people would expect to work either). In troubleshooting 
this one, I'd definitely try it without multiple mappings to see if it fixes 
it.


James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Friday, June 01, 2012 10:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Data Import Handler fields with different values in column and 
name


James: Is there some particular DIH logging he can turn on to see what is
really happening with his field name mapping? In other words, if DIH/Solr
really is ignoring that field mapping, to find out exactly why.

-- Jack Krupansky

-Original Message- 
From: Dyer, James

Sent: Friday, June 01, 2012 10:50 AM
To: solr-user@lucene.apache.org
Subject: RE: Data Import Handler fields with different values in column and
name

Are you leaving both mappings in there, like this...

entity name=documento query=SELECT iddocumento,nrodocumento,asunto FROM
documento
field column=iddocumento name=iddocumento /
field column=nrodocumento name=nrodocumento /
field column=asunto name=asunto /
field column=asunto name=anotherasunto /
/entity

If so, I'm not sure you can map asunto to two different fields like this.
For that, you may need to write a transformer that will duplicate asunto
for you.  Although, in most cases all you need to do is add a copyField /
in schema.xml to copy asunto to anotherasunto.  But a DIH Transformer
would be helpful, for instance, if asunto is multi-valued but you only
want to copy the first value to anotherasunto (perhaps you need to sort on
it, which is not possible with multi-valued fields).

If this doesn't help, let us know exactly why you need to duplicate asunto
and maybe you can get more help from there.

(If you're not trying to duplicate asunto and you're sure you've taken the
duplicate out of data-config.xml, then go ahead and double-check spelling
and case in all your config files.  Besides a typo somewhere, I'm not sure
what else would cause this not to map.)

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Rafael Taboada [mailto:kaliman.fore...@gmail.com]
Sent: Thursday, May 31, 2012 4:13 PM
To: solr-user@lucene.apache.org
Subject: Fwd: Data Import Handler fields with different values in column and
name

Please,

Can anyone guide me through this issue? Thanks



-- Forwarded message --
From: Rafael Taboada kaliman.fore...@gmail.com
Date: Thu, May 31, 2012 at 12:30 PM
Subject: Data Import Handler fields with different values in column and name
To: solr-user@lucene.apache.org


Hi folks,

I'm using Solr 3.6 and I'm trying to import data from my database to solr
using Data Import Handler. My db-config is like this:

<dataConfig>
  <dataSource driver="oracle.jdbc.OracleDriver"
      url="jdbc:oracle:thin:@localhost:1521:XE" user="admin" password="admin" />
  <document>
    <entity name="documento" query="SELECT iddocumento,nrodocumento,asunto FROM documento">
      <field column="iddocumento" name="iddocumento" />
      <field column="nrodocumento" name="nrodocumento" />
      <field column="asunto" name="asunto" />
    </entity>
  </document>
</dataConfig>

My problem appears when I try to use different values for column and name
in the field tag, for example:

<field column="asunto" name="anotherasunto" />

When the name differs from the column, the field is omitted. Can you please
help me with this issue?

My schema.xml is:

<types>
  <fieldtype name="string" class="solr.StrField" sortMissingLast="true" />
</types>

<fields>
  <!-- general -->
  <field 

Lucene/Solr Search Engineers

2012-06-01 Thread SV
Hi,

We are hiring multiple Lucene/Solr engineers, tech leads, and architects based
in Minneapolis, both full time and consulting, to develop a new search
platform.

Please reach out to me - svamb...@gmail.com

Thanks,
Venkat Ambati
Sr. Manager, Best Buy


Re: A few random questions about solr queries.

2012-06-01 Thread Shawn Heisey

On 5/29/2012 4:18 AM, santamaria2 wrote:

*3)* I've rummaged around a bit, looking for info on when to use q vs fq. I
want to clear my doubts for a certain use case.

Where should my date range queries go? In q or fq? The default settings in
my site show results from the past 90 days with buttons to show stuff from
the last month and week as well. But the user is allowed to use a slider to
apply any date range... this is allowed, but it's not /that/ common.
I definitely use fq for filtering various tags. Choosing a tag is a common
activity.


I can't answer your facet questions, but this one I can.  If you are 
using the default relevancy ranking and you do not want the values in a 
given part of your search to affect the score, put it in a filter query 
(fq).  Also, if you are sorting all your search results in a 
deterministic way rather than using relevancy, use a filter query.


If you do want those values to affect the score, which is normal for 
fulltext fields, put your search clause in the regular query (q).  Most 
of the time, a date range is not something that you want to affect the 
relevancy score, so it is a perfect candidate for filter queries.
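
A minimal sketch of that split, assuming a date field named pubdate, a tag field named tag, and a local example URL (all three are hypothetical and not taken from the original poster's setup): the user's text goes in q, while the date range and each tag go in their own fq.

```python
from urllib.parse import urlencode


def build_select_url(base_url, user_query, date_field, days, tags=()):
    """Build a Solr select URL with the scored part in q and filters in fq."""
    params = [
        ("q", user_query),
        # Date range as a filter: it restricts results but does not affect the score.
        # Rounding to the day keeps the filter text stable for a whole day.
        ("fq", f"{date_field}:[NOW/DAY-{days}DAYS TO NOW/DAY+1DAY]"),
    ]
    for tag in tags:
        # One fq per tag, so each filter is a separate clause.
        params.append(("fq", f"tag:{tag}"))
    return base_url + "?" + urlencode(params)


url = build_select_url(
    "http://localhost:8983/solr/select", "enterprise search", "pubdate", 90, ["news"]
)
```

Putting each filter in its own fq is a design choice rather than a requirement; it keeps the common 90-day filter and the tag filters independent of one another.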


Thanks,
Shawn



Re: possible status codes from solr during a (DIH) data import process

2012-06-01 Thread Shawn Heisey

On 6/1/2012 11:51 AM, Savvas Andreas Moysidis wrote:

Hello,

Driven by the same requirements we also implemented the same polling
mechanism (in java) and found it a bit awkward and error prone having
to search through the returned response for occurrences of the terms
"failure" or "Rollback", etc.
It would be *really* handy if the status command returned numeric
values to reflect the current state of the DIH process (similar to the
HTTP status codes a server sends to a web browser).

Our 2 cents.. :)

On 1 June 2012 15:29, geeky2 gee...@hotmail.com wrote:

thank you ALL for the great feedback - very much appreciated!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110p3987263.html
Sent from the Solr - User mailing list archive at Nabble.com.


I have filed some Jira issues on DIH status, and created a patch for one 
of them.


https://issues.apache.org/jira/browse/SOLR-2729
https://issues.apache.org/jira/browse/SOLR-2728

I thought I had filed an issue for redoing the status response so 
there's a machine readable section and a human readable section, but now 
I can't seem to find it, so perhaps I never did.
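
Until something like that exists, a client has to derive its own codes from the status response. A minimal Python sketch, assuming the response has already been parsed into a dict with a "status" entry and a "statusMessages" map; that layout, the message wording it looks for, and the numeric codes are all assumptions for illustration, not a documented contract:

```python
# Hypothetical numeric codes; DIH itself does not define these.
IDLE_OK, BUSY, FAILED = 0, 1, 2


def classify_dih_status(response):
    """Map a parsed DIH status response (a dict) to one of the codes above."""
    if response.get("status") == "busy":
        return BUSY
    messages = response.get("statusMessages", {})
    # Only message values are scanned; counter names are ignored on purpose,
    # since a counter name may mention failures even when its value is 0.
    text = " ".join(str(value) for value in messages.values()).lower()
    if "fail" in text or "rolled back" in text or "rollback" in text:
        return FAILED
    return IDLE_OK
```

This is still string matching underneath, which is exactly the fragility discussed in this thread; it only confines the matching to one place.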


Thanks,
Shawn



Re: Difference between textfield and strfield

2012-06-01 Thread Gau
Is there any alternative to sorting? I mean, sorting can affect query
performance. Is there a way to embed this into Solr without taking a toll on
the system?

I tried boosting the scores based on strdist, but that seems to bring in
more results than expected.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Difference-between-textfield-and-strfield-tp3986916p3987338.html


Re: solr, how can I make search query with fixed slop(distance)

2012-06-01 Thread Jack Krupansky

Take a look at the Surround Query Parser that lets you do span queries:
http://wiki.apache.org/solr/SurroundQueryParser

  solr  2w user

But, they are very simple, maybe too simple. OTOH, you may be able to 
combine them with nested queries.


The Lucid Imagination LucidWorks Enterprise product has support for span 
queries:

http://lucidworks.lucidimagination.com/display/lweug/Proximity+Operations

 title:  solr before:2 user

But that won't help you if you are using only Solr.

-- Jack Krupansky

-Original Message- 
From: Jihyun Suh

Sent: Friday, June 01, 2012 12:08 PM
To: solr-user-...@lucene.apache.org ; solr-user@lucene.apache.org
Subject: solr, how can I make search query with fixed slop(distance)

I want to search for data within a fixed slop in Solr.

For example, I use the search query 'title:+solr +user ~2' to find
data that has 'solr' and 'user' within a slop of 2, but it does not work in
Solr. I found some parameters (defType=edismax, pf, qs, ps), but they change
only the order of the results, not the results themselves.

If I use a phrase query such as 'title:"solr user"~2', it does not match
text like ... users for solr ... where the keywords are not in that order.

What can I do? Help me.



Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter

2012-06-01 Thread Niran Fajemisin
So I was able to run some additional tests today on this. I tried to use a 
stored function instead of a stored procedure. The hope was that the stored 
function would simply be a wrapper for the stored procedure and would simply 
return the cursor as the return value. This unfortunately did not work.

My test attempted to call the function from the query attribute of the entity 
tag as such:  
{call my_stored_func()}

It raised an error stating that: 'my_stored_func' is not a procedure or is 
undefined.  This makes sense because the invocation format above is customarily 
reserved for a stored procedure.

So then I tried the typical approach for invoking a function which would be:
{call ? := my_stored_function()}

And as expected this resulted in an error stating that: "not all variables 
bound". Again, this is expected as the ? notation would be the placeholder 
for an output parameter that would be registered with the OracleTypes.CURSOR 
type in a typical JDBC program.

Note that this function has been tested outside of DIH and it works when 
properly invoked.

I think the bottom-line here is that there is no proper support for stored 
procedures (or functions for that matter) in DIH. This is really unfortunate 
because anyone thinking of doing any significant processing in the source RDBMS 
prior to data export would have to look elsewhere. Short of adding this 
functionality to the JdbcDataSource class of the DIH, I think I'm at a dead end.

If anyone knows of any alternatives I would greatly appreciate hearing them.

Thanks for the responses as usual.

Cheers.





 From: Lance Norskog goks...@gmail.com
To: solr-user@lucene.apache.org; Niran Fajemisin afa...@yahoo.com 
Sent: Thursday, May 31, 2012 3:09 PM
Subject: Re: Using Data Import Handler to invoke a stored procedure with 
output (cursor) parameter
 
Can you add a new stored procedure that uses your current one? It
would operate like the DIH expects.

I don't remember if DB cursors are a standard part of JDBC. If they
are, it would be a great addition to the DIH if they work right.

On Thu, May 31, 2012 at 10:44 AM, Niran Fajemisin afa...@yahoo.com wrote:
 Thanks for your response, Michael. Unfortunately changing the stored 
 procedure is not really an option here.

 From what I'm seeing, it would appear that there's really no way of somehow 
 instructing the Data Import Handler to get a handle on the output parameter 
 from the stored procedure. It's a bit surprising though that no one has run 
 into this scenario but I suppose most people just work around it.

 Anyone else care to shed some more light on alternative approaches? Thanks 
 again.




 From: Michael Della Bitta michael.della.bi...@appinions.com
To: solr-user@lucene.apache.org
Sent: Thursday, May 31, 2012 9:40 AM
Subject: Re: Using Data Import Handler to invoke a stored procedure with 
output (cursor) parameter

I could be wrong about this, but Oracle has a table() function that I
believe turns the output of a function into a table. So possibly you
could wrap your procedure in a function that returns the cursor, or
convert the procedure to a function.

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Thu, May 31, 2012 at 8:00 AM, Niran Fajemisin afa...@yahoo.com wrote:
 Hi all,

 I've seen a few questions asked around invoking stored procedures from 
 within Data Import Handler but none of them seem to indicate what type of 
 output parameters were being used.

 I have a stored procedure created in Oracle database that takes a couple 
 input parameters and has an output parameter that is a reference cursor. 
 The cursor is expected to be used as a way of iterating through the 
 returned table rows. I'm using the following format to invoke my stored 
 procedure in the Data Import Handler's data config XML:

 <entity name="entity_name" ... query="{call my_stored_proc(inParam1, 
 inParam2)}" ...></entity>

 I have tested that this query works prior to attempting to use it from 
 within the DIH. But when I attempt to invoke this stored procedure, it 
 naturally complains that the output parameter is not specified 
 (essentially a mismatch in the number of parameters).

 I don't know of any way to pass in a cursor parameter (or any output 
 parameter for that matter) to the stored procedure invocation from within 
 the entity definition.  I would greatly appreciate if anyone could 
 provide any pointers or hints on how to proceed.

 Thanks so much for your time







-- 
Lance Norskog
goks...@gmail.com




Re: Replacing payloads for per-document-per-keyword scores

2012-06-01 Thread Chris Hostetter
:  Hoss guessed that we could override Term Frequency with PreAnalyzedField[1]
:  for the per-keyword scores, since keywords (tags) always have a Term
:  Frequency of 1 and the TF calculation is very fast. However it turns out
:  that you can't[2] specify TF in the PreAnalyzedField.

Yeah ... sorry for steering you in the wrong direction there.

Mikhail's suggestion is dead on what i thought you could 
already do with PreAnalyzedField...

: if manipulating tf is a possible approach, why don't extend
: KeywordTokenizer to make it work in the following manner:
: 
: 3|wheel -> {wheel,wheel,wheel}
: 
: it will allow supply your per-term-per-doc boosts as a prefixes for field
: values and multiply them during indexing internally.

..to be clear, this won't/shouldn't be as inefficient and memory bloated 
as it sounds because you don't actually have to copy the Term N times --  
You should just be able to have the TokenStream you return from your 
Tokenizer implement incrementToken() by simply incrementing a counter and 
returning true until it's been called N times, w/o modifying any other 
state.

Or at least ... that's my theory ... i've been wrong before.
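
The counter idea can be modeled in a few lines of Python. This is a plain model of the technique, not Lucene code: the class name RepeatedTermStream and the method increment_token are hypothetical stand-ins for a TokenStream and its incrementToken(), and the "N|term" input format is taken from Mikhail's example above.

```python
class RepeatedTermStream:
    """Emit the same term N times without copying it, given input like "3|wheel"."""

    def __init__(self, field_value):
        # Split the "N|term" prefix form into the repeat count and the term.
        count, _, term = field_value.partition("|")
        self.term = term              # stored once, never duplicated
        self.remaining = int(count)   # how many more times to report the term

    def increment_token(self):
        """Return True while the term should be emitted again, then False."""
        if self.remaining > 0:
            self.remaining -= 1       # the only state that changes between calls
            return True
        return False


stream = RepeatedTermStream("3|wheel")
emitted = 0
while stream.increment_token():
    emitted += 1
```

The only state touched per call is the counter, which is why the memory cost does not grow with N.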

-Hoss


Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter

2012-06-01 Thread Michael Della Bitta
Apologies for the terseness of this reply, as I'm on my mobile.

To treat the result of a function call as a table in Oracle SQL, use the
table() function, like this:

select * from table(my_stored_func())

HTH,

Michael


solrDocumentList

2012-06-01 Thread gopes
We are using Lucid UI and solr to index our collection of xml files.  I am
getting the solrDocumentList like this 
[SolrDocument[{id=1331226833510, Street_Addr=[113 113TH ST], name=[113 113TH
ST SASKATOON SK S7N1V8], Municipality_Name=[SASKATOON], Province_Code=[SK],
Postal_Code=[S7N1V8]}]

But I need to have a response like this  SolrDocument[{id=1330247542287,
Municipality_Name=SASKATOON, Province_Code=SK, Postal_Code=S7N3Z1,
Street_Addr=3-THE BROADWAY UNIVERSITY DR}]
 
Can anyone help me see where I am going wrong, or whether any changes are required
in the schema.xml file?
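
The bracketed values suggest these fields are being returned as multi-valued. One possible cause, which is an assumption here since the schema is not shown, is that they are declared with multiValued="true" in schema.xml. If changing the schema is not an option, the client can unwrap single-element lists itself. A minimal Python sketch, with a hypothetical helper name and the document represented as a plain dict:

```python
def unwrap_single_values(doc):
    """Replace one-element lists with their sole element; leave other values alone."""
    return {
        key: value[0] if isinstance(value, list) and len(value) == 1 else value
        for key, value in doc.items()
    }


doc = {
    "id": "1331226833510",
    "Street_Addr": ["113 113TH ST"],
    "Municipality_Name": ["SASKATOON"],
    "Province_Code": ["SK"],
    "Postal_Code": ["S7N1V8"],
}
flat = unwrap_single_values(doc)
```

Lists with more than one element are left untouched, so genuinely multi-valued fields are not silently truncated.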

Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solrDocumentList-tp3987347.html