Re: Missing required field: id Using ExtractingRequestHandler

2009-03-20 Thread Larry Reid
Doh! I get it. Ignore my questions in the previous e-mail. The XML files
have the id in them. For Word/Excel/PDF etc., it's up to the client
(crawler) or whatever to create a unique id if I want a unique id.

Thanks again for pointing me in the right direction. I'm really
impressed with how easy it's been for a non-Java/web app guy to get Solr
going. Excellent work!
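
For anyone following along, here is a minimal sketch of what "the client creates a unique id" could look like. It only builds the request URL. The ext.literal.id parameter is the one from Chris's reply quoted below; the SHA-1-of-path id scheme and the helper name extract_url are assumptions made for illustration, not something the thread or Solr prescribes.

```python
import hashlib
from urllib.parse import urlencode


def extract_url(base_url, file_path, extra_params=None):
    """Build an update/extract URL that carries a client-generated id."""
    # Hypothetical id scheme: SHA-1 of the file path, so the same file
    # always maps to the same id. Any stable unique string would do.
    doc_id = hashlib.sha1(file_path.encode("utf-8")).hexdigest()
    params = {"ext.literal.id": doc_id}
    params.update(extra_params or {})
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(extract_url("http://localhost:8983/solr/update/extract",
                      "docs/tutorial.pdf",
                      {"ext.def.fl": "text"}))
```

The file itself would still be posted the same way as in the curl command further down; only the query string changes.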

On Thu, 2009-03-19 at 16:51 -0700, Chris Harris wrote:

 Unless there's a regression in the ExtractingRequestHandler, this
 is most likely happening because both
 
 A) you have an id field defined in your solr schema file that's marked
 as a required field
 
 and
 
 B) you did not specify an ID parameter when you submitted your
 document to the handler.
 
 If you don't want your Solr docs to have an id field, then mark that
 field as not required in your schema.
 
 If you *do* want your Solr docs to have a required field called id,
 then you'll need to specify the ID when you submit your document. One
 way is using an ext.literal parameter, more or less like this:
 
 startofURL...ext.literal.id=13...restofURL
 
 Alternatively, you can try the field mapping mechanism, which is
 hopefully described on the wiki page.
 
 Cheers,
 Chris
 
 On Thu, Mar 19, 2009 at 3:46 PM, Larry Reid lcr...@jadesystems.ca wrote:
  I am trying to index Word, PDF and other documents with Solr. I installed
  the latest nightly build of Solr on March 17. I followed the
  instructions in the Wiki for ExtractingRequestHandler at
  http://wiki.apache.org/solr/ExtractingRequestHandler#head-c95841f9eda007b6b4e4594ead12a04223cf7b6e.
 
  I have produced text output from Tika in the nightly build directories
  from PDF files.
 
  When I try the suggested test curl commands in the Getting Started with
  the Solr Example section of the Wiki page, I get the following. Any idea
  what I've done wrong? Thanks in advance for your help.
 
  $ curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text -F myfi...@tutorial.pdf
  <html>
  <head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
  <title>Error 500 </title>
  </head>
  <body><h2>HTTP ERROR: 500</h2><pre>org.apache.solr.common.SolrException:
  Document [null] missing required field: id
 
  org.apache.solr.common.SolrException:
  org.apache.solr.common.SolrException: Document [null] missing required field: id
  at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169)
  at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
  at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
  at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
  at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
  at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
  at org.mortbay.jetty.Server.handle(Server.java:285)
  at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
  at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
  at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
  at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
  at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
  at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
  at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
  Caused by: org.apache.solr.common.SolrException: Document [null] missing required field: id
  at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:292)
  at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
  at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:90)
  at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:95)
 at
  

Issue with Facet Query

2009-03-20 Thread dabboo

Hi,

I am searching the indexes with facet query. Below is the query.

q=Answer&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest&facet=true&facet.field=productPrice_product_str_s:[0%20TO%2020]

It is giving me an exception saying:

<str name="exception">org.apache.solr.common.SolrException: undefined field
productPrice_product_str_s:[0 TO 20] at
org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:994) at
org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:152) at
org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:182)
at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:96)
at
org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:70)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute

Can someone please guide me on how to prevent this exception? I guess I am
missing some entries in a config file such as solrconfig or schema. I would
appreciate it if someone could tell me the specific entries I need to make in
any config file.

Thanks a lot.

Thanks,
Amit Garg
-- 
View this message in context: 
http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22615577.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Issue with Facet Query

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 1:14 PM, dabboo ag...@sapient.com wrote:


 Hi,

 I am searching the indexes with facet query. Below is the query.


 q=Answer&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest&facet=true&facet.field=productPrice_product_str_s:[0%20TO%2020]


facet.field takes a field name. It does not accept queries. Use facet.query
for getting count of a query. Use fq to restrict facets by a certain query.

See http://wiki.apache.org/solr/SimpleFacetParameters
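
A minimal sketch of how the request parameters could be rebuilt along those lines, assuming the field name and range from the original question. The helper name facet_range_query and the restrict flag are made up for illustration; facet.query and fq are the parameters named above.

```python
from urllib.parse import urlencode


def facet_range_query(q, field, low, high, restrict=False):
    """Build a query string that counts a range via facet.query."""
    range_query = f"{field}:[{low} TO {high}]"
    params = [
        ("q", q),
        ("facet", "true"),
        # facet.query takes a query; facet.field would take a bare field name
        ("facet.query", range_query),
    ]
    if restrict:
        # fq narrows the result set itself to the same range
        params.append(("fq", range_query))
    return urlencode(params)


if __name__ == "__main__":
    print(facet_range_query("Answer", "productPrice_product_str_s", 0, 20))
```

urlencode takes care of escaping the colon, brackets and spaces, so the range does not have to be percent-encoded by hand.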
-- 
Regards,
Shalin Shekhar Mangar.


Re: Issue with Facet Query

2009-03-20 Thread dabboo

Thanks a lot for this information. But is there any way I can impose a
range on the facet? For example, if I want to search the data within a
specific range, how should I form my query?

Do I need to add some entries somewhere?

Thanks,
Amit Garg







Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread aerox7

== where are you seeing it as SolÃ¨ne as opposed to the
correct way of solène?

I have SolÃ¨ne in my MySQL database, so I don't know whether this is correct
or not. I guess that SolÃ¨ne is solène in UTF-8?

I tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp.
When I try with solène everything is OK, but when I try with SolÃ¨ne (which
is what I have in the DB) the analysis converts Ã to A and deletes ¨, so I
get SolAne!

I think that ISOLatin1AccentFilterFactory only accepts strings in the
ISO-8859-1 charset.

Is there any way to transform my string to ISO-8859-1 before the indexing
process? Maybe by creating a transformer in DataImportHandler? (I have never
coded in Java :( )

Thank you all.


Koji Sekiguchi-2 wrote:
 
 aerox7 wrote:
 Hi,
 I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène).
 I want to transform this to solene, so I use Solr's
 ISOLatin1AccentFilterFactory to perform this task, but it doesn't work.

 I guess that SolÃ¨ne is solène in UTF-8? I also set Tomcat to UTF-8,
 so normally ISOLatin1AccentFilterFactory should replace the accent ...

 Any ideas?

 I use DataImportHandler.
   
 
 If a mapping rule Ã¨ to e is always true in your field, you can try
 to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the
 following line to mapping-ISOLatin1Accent.txt:
 
 "Ã¨" => "e"
 
 and add the following fieldType:
 
 <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 
 MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.
 
 Koji
 
 
 
 




Re: Issue with Facet Query

2009-03-20 Thread dabboo

Shalin, thanks a lot. One quick question:

Now, after forming the query the way you suggested, I am getting:

<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="productPrice_product_s:[0 TO 20]">23315</int>
  </lst>
  <lst name="facet_fields" />
  <lst name="facet_dates" />
</lst>

But it is not returning any records. Do I need to add an entry for this field
to schema.xml, or anywhere else, to get the records?

Thanks,
Amit Garg


Shalin Shekhar Mangar wrote:
 
 On Fri, Mar 20, 2009 at 1:49 PM, dabboo ag...@sapient.com wrote:
 

 Thanks a lot for this information. But is there any way, I can impose the
 range on the facet.
 for e.g. If I want to search the data between a specific range, how
 should
 I
 form my query.

 
 Use a filter query, fq=productPrice_product_str_s:[0 TO 20]
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 




Search transparently with Solr with multiple cores, different indexes, common response type

2009-03-20 Thread Giovanni De Stefano
Hello all,

here I am with another question... :-)

I figured that I have to change approach to implement the requirements I
have :-(

Here is what I have to index:

1) data A in an Oracle DB Table A
2) data B in an Oracle DB Table B
3) data C in different files

Data A, B, and C are slightly different, thus they are indexed
differently; obviously the client receives the search results for all data
types in a consistent/common format. The client application shall be able to
search among each or all data types (A, B, C). The order will be
configurable, like: return the first 5 from data A, the first 10 from B,
all C.

At first I thought of using only one Solr with different datasources, and
one huge index, but I figured that delta imports would be very
hard/expensive/impossible.

Reading some other posts, I thought that maybe a better approach would be as
follows:

1) one Solr core for each data type (one for A, one for B, one for C)
2) one index for each data type, thus one document type for A, one for
B, and one for C
3) client applications shall be able to search on one or all cores
4) the cores shall return search results in a common XML format
5) search results shall be aggregated in a configurable way

Can you please tell me if this architecture is possible with Solr? Obviously
I am not looking for an out-of-the-box solution, I just need to
understand what I have to develop myself and what is already available.

1) is a multicore architecture: I know it is possible and I tested that it
works great
2) same as above, no problems here :-)
3) I want to hide the different cores from the client application; the
client application should send its requests to one guy that parses the
request and forwards it to the cores. Is this a custom RequestHandler? Any
link (to the Wiki?) to understand this better? Or is there anything already
available to achieve this?
4) The guy that parses the request and forwards it to the cores shall
aggregate and return results in a common XML format: is this a custom
ResponseHandler?
5) I know this is just my business logic :-)
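
To make point 5 concrete, here is a minimal sketch of the configurable aggregation ("the first 5 from data A, the first 10 from B, all C"). It assumes each core's hits have already been fetched and converted to a common shape; the function name aggregate and the (core, limit) configuration format are made up for illustration and are not part of Solr.

```python
def aggregate(results_by_core, order):
    """Concatenate per-core hit lists, keeping at most `limit` hits per core.

    `order` is a list of (core_name, limit) pairs; a limit of None means "all".
    """
    merged = []
    for core, limit in order:
        hits = results_by_core.get(core, [])
        merged.extend(hits if limit is None else hits[:limit])
    return merged


if __name__ == "__main__":
    # Placeholder hits; real ones would come from the three cores.
    results = {
        "A": [f"A{i}" for i in range(7)],
        "B": [f"B{i}" for i in range(12)],
        "C": [f"C{i}" for i in range(3)],
    }
    print(aggregate(results, [("A", 5), ("B", 10), ("C", None)]))
```

Keeping the order and limits in plain data means the client-facing layer can change the mix without touching the per-core queries.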

Any thoughts/warnings/advice about this?

Thanks a lot in advance!
Giovanni


Re: Issue with Facet Query

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 2:27 PM, dabboo ag...@sapient.com wrote:


 Shalin, thanks a lot. One quick question:

 Now, after putting the query in the way, you suggested, I am getting:

 <lst name="facet_counts">
   <lst name="facet_queries">
     <int name="productPrice_product_s:[0 TO 20]">23315</int>
   </lst>
   <lst name="facet_fields" />
   <lst name="facet_dates" />
 </lst>

 But it is not returning me records. Do I need to enter this field entry in
 schema.xml to get the records or anywhere else.


facet.query returns the number of documents matching that query after
applying any filters (fq) that you may have specified.

Can you tell us your use-case?
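
As a rough illustration of that point (facet.query yields a count, not documents), the sketch below reads the count out of a response shaped like the one quoted above. The surrounding response element and the helper name facet_query_counts are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Sample shaped like the facet_counts block quoted above; the wrapping
# <response> element is assumed for the sake of a parseable document.
SAMPLE = """<response>
  <lst name="facet_counts">
    <lst name="facet_queries">
      <int name="productPrice_product_s:[0 TO 20]">23315</int>
    </lst>
    <lst name="facet_fields"/>
    <lst name="facet_dates"/>
  </lst>
</response>"""


def facet_query_counts(xml_text):
    """Return {facet query: matching document count} from a response."""
    root = ET.fromstring(xml_text)
    path = "./lst[@name='facet_counts']/lst[@name='facet_queries']/int"
    return {node.get("name"): int(node.text) for node in root.findall(path)}


if __name__ == "__main__":
    print(facet_query_counts(SAMPLE))
```

Only numbers come back from this section of the response; the matching documents themselves are selected by q and fq.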

-- 
Regards,
Shalin Shekhar Mangar.


Re: Issue with Facet Query

2009-03-20 Thread dabboo

Thanks Shalin, thanks a lot. I appreciate your help in resolving this issue.

Thanks,
Amit





RE: Special character indexing

2009-03-20 Thread Gargate, Siddharth
Hi Shalin,
Thanks for the suggestion. I tried the following code (not sure about the
exact usage):

import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Send updates in the javabin format instead of XML
CommonsHttpSolrServer ess = new CommonsHttpSolrServer("http://localhost:8983/solr");
ess.setRequestWriter(new BinaryRequestWriter());
SolrInputDocument solrdoc = new SolrInputDocument();
solrdoc.addField("id", "Kimi");
solrdoc.addField("name", "03 Kimi Räikkönen ");
ess.add(solrdoc);

But I got the following exception on the server:

WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters: wt=javabin
  If you are using solrj, make sure to register a request handler to /update rather then use this servlet.
  Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" > to your solrconfig.xml


Mar 20, 2009 3:14:48 PM org.apache.solr.common.SolrException log
SEVERE: Error processing legacy update command:com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 1))
 at [row,col {unknown-source}]: [1,1]
at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:675)
at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:660)
at com.ctc.wstx.sr.BasicStreamReader.readSpacePrimary(BasicStreamReader.java:4916)
at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2003)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:148)
at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:393)
at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:78)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1098)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:295)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

Thanks in advance for help.
Siddharth

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Friday, March 20, 2009 10:35 AM
To: solr-user@lucene.apache.org
Subject: Re: Special character indexing

On Fri, Mar 20, 2009 at 10:17 AM, Gargate, Siddharth sgarg...@ptc.com wrote:

 I tried with Jetty but the same issue. Just a guess, but looks like 
 the fix for SOLR-973 might have introduced this issue.


I'm not sure how SOLR-973 can cause this issue. Can you try using the 
BinaryRequestWriter and see if it succeeds?

http://wiki.apache.org/solr/Solrj#head-ddc28af4033350481a3cbb27bc1d25bffd801af0

--
Regards,
Shalin Shekhar Mangar.


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Óscar Marín Miró
Hi,

My guess is that *although* your DB is in UTF-8, the database engine sends
you the rows in ISO-Latin1, so before doing *anything* after receiving the
data, you should transcode from ISO-Latin1 to UTF-8 and then send that to
Solr. I'm no Java expert, but in Perl (MySQL DB in UTF-8) I have to do this
with every row:

$row=decode("iso-8859-1",$row);

... and before building the XML to invoke Solr and add the document:

$row=encode("utf8",$row);
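
To show the symptom in isolation, here is a small sketch of how UTF-8 bytes that get read as ISO-8859-1 produce the two-character sequence described in this thread, and how reversing the mix-up restores the text. The function names mojibake and repair are made up for the example; this only illustrates the encoding mismatch and is not a statement about what MySQL or the JDBC driver actually does in this setup.

```python
def mojibake(text):
    """Return what UTF-8 encoded text looks like when read as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(text):
    """Undo the mix-up: take the characters back to bytes, read them as UTF-8."""
    return text.encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    broken = mojibake("solène")
    # è is two bytes in UTF-8 (0xC3 0xA8), which ISO-8859-1 shows as Ã and ¨
    print(broken)
    print(repair(broken))
```

An accent filter that sees the broken form has no è to fold: it only finds Ã (which it turns into A) and ¨, which matches the SolAne result reported from the analysis page.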

On Fri, Mar 20, 2009 at 10:55 AM, aerox7 amyne.berr...@me.com wrote:


 I added:
 "Ã¨" => "e" to mapping-ISOLatin1Accent.txt

 and added the following fieldType:

 <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>

 But I still have the same problem! It only works when I store the ISO string
 in the UTF-8 database (e.g. storing solène, not solÃ¨ne) :,(








-- 
“I may not believe in myself, but I believe in what I'm doing.”

-- Jimmy Page


Re: Issue with Facet Query

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 1:49 PM, dabboo ag...@sapient.com wrote:


 Thanks a lot for this information. But is there any way, I can impose the
 range on the facet.
 for e.g. If I want to search the data between a specific range, how should
 I
 form my query.


Use a filter query, fq=productPrice_product_str_s:[0 TO 20]

-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread aerox7

I added:
"Ã¨" => "e" to mapping-ISOLatin1Accent.txt

and added the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

But I still have the same problem! It only works when I store the ISO string
in the UTF-8 database (e.g. storing solène, not solÃ¨ne) :,(








how can I check field which are indexed but not stored?

2009-03-20 Thread sunnyfr

Hi

I have an issue: some data comes up even though I've applied a filter on it
and it shouldn't. When I check in my MySQL database, the document has
obviously been updated, so I would like to see how it looks in Solr.

If I do /solr/video/select?q=id:8582006 I only see the fields which have
been stored. Is there a way to see how the data is indexed for the other
fields of my schema, which are indexed but not stored?

Something like the DataImportHandler console, where with verbose activated I
can see every field of my schema.

Otherwise, what would you reckon in this case: a document which has not been
updated? How can I sort it out?

Thanks a lot guys for your excellent help



FW: Special character indexing

2009-03-20 Thread Gargate, Siddharth
Thanks Shalin,

Adding BinaryUpdateRequestHandler solved the issue. Thank you very much. 

Just one query, shouldn't XmlUpdateRequestHandler also work for these 
characters? I saw another user mentioning the same issue and it was working 
with DirectXmlRequest. 



-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Friday, March 20, 2009 3:58 PM
To: solr-user@lucene.apache.org
Subject: Re: Special character indexing

On Fri, Mar 20, 2009 at 3:19 PM, Gargate, Siddharth sgarg...@ptc.com wrote:

 Hi Shalin,
Thanks for the suggestion. I tried following code, (not sure 
 about the exact usage)

 CommonsHttpSolrServer ess = new CommonsHttpSolrServer("http://localhost:8983/solr");
 ess.setRequestWriter(new BinaryRequestWriter());
 SolrInputDocument solrdoc = new SolrInputDocument();
 solrdoc.addField("id", "Kimi");
 solrdoc.addField("name", "03 Kimi Räikkönen ");
 ess.add(solrdoc);

 But got following exception on the server

 WARNING: The @Deprecated SolrUpdateServlet does not accept query
 parameters: wt=javabin
  If you are using solrj, make sure to register a request handler to
 /update rather then use this servlet.
  Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" > to your solrconfig.xml


Yes, you need to add the following to your solrconfig.xml

<requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />

--
Regards,
Shalin Shekhar Mangar.


Re: alternative lucene directories support

2009-03-20 Thread Andrey Klochkov
Otis,

The fact is that some code instantiates FSDirectory indirectly by using
deprecated constructors.
I provided a patch here https://issues.apache.org/jira/browse/SOLR-465 but I
don't have rights to re-open the issue.

Also there is logic in Solr which is tied to file system usage even if a file
system index is not used:
- SolrCore checks index existence by looking at file system directory
existence, which is incorrect in the case of a non-FS directory
- Spell checker has FSDirectory hard-coded

IMO this code ought to be changed too.

I'm ready to contribute all these changes if it's appropriate. Is it better
to write to the dev mailing list for that?

On Thu, Mar 19, 2009 at 8:58 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:


 My quick grep of the sources and scan of the results doesn't see any
 problematic areas, but if you see some places that still need a fix, yes,
 please reopen the issue and submit the patch.  Do you also plan on
 submitting the actual alternative Directory impl?

 $ ffjg FSDire | egrep 'SolrIndexW|SolrCore|UpdateH'
 ./src/java/org/apache/solr/core/SolrCore.java:import
 org.apache.lucene.store.FSDirectory;
 ./src/java/org/apache/solr/core/SolrCore.java://return new
 SolrIndexSearcher(this, schema, main,
 IndexReader.open(FSDirectory.getDirectory(getIndexDir()), readOnly), true,
 false);

 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



 - Original Message 
  From: Andrey Klochkov akloch...@griddynamics.com
  To: solr-user@lucene.apache.org
  Sent: Thursday, March 19, 2009 10:22:57 AM
  Subject: alternative lucene directories support
 
  Hi all
 
  We want to use Solr with a Lucene Directory implementation which places the
  index into a Coherence data grid.
  In fact I managed to run Solr in such a configuration, although I had to
  patch it.
  I think that the issue about alternate directories support (SOLR-465) should
  be re-opened because there are some places in the source code where
  FSDirectory hard-coding is still present (SolrCore, SolrIndexWriter and
  UpdateHandler).
  I can provide a patch to fix it.
 
  WDYT?
 
  --
  Andrew Klochkov




-- 
Andrew Klochkov


Facet Query Results Issue

2009-03-20 Thread dabboo

Hi,

this is my facet query.

facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]
 

This is my query and these are results, I am getting: 

int name=100202/int 
  int name=1057/int 
  int name=10.614/int 
  int name=10.211/int 
  int name=10.6710/int 
  int name=10.89/int 
  int name=10.999/int 
  int name=1.337/int 
  int name=16/int 
  int name=10.45/int 
  int name=10.344/int 
  int name=1.012/int 
  int name=1.22/int 
  int name=1.662/int 
  int name=10.632/int 
  int name=10.662/int 
  int name=1.41/int 
  int name=1.71/int 
  int name=1.81/int 
  int name=10.331/int 
  int name=10.751/int 
  int name=10.91/int 
  int name=.010/int 
  int name=.20/int 
  int name=100.050/int 
  int name=100.070/int 
  int name=100.130/int 
  int name=100.20/int 
  int name=100.250/int 
  int name=100.330/int 
  int name=100.40/int 
  int name=100.450/int 
  int name=100.530/int 
  int name=100.60/int 
  int name=100.670/int 
  int name=100.730/int 
  int name=100.80/int 
  int name=100.870/int 
  int name=100.950/int 
  int name=100.960/int 
  int name=1010/int 
  int name=101.10/int 
  int name=101.130/int 
  int name=101.20/int 
  int name=101.270/int 
  int name=101.330/int 
  int name=101.40/int 
  int name=101.470/int 
  int name=101.60/int 
  int name=101.670/int 
  int name=101.730/int 
  int name=101.80/int 
  int name=101.870/int 
  int name=1020/int 
  int name=102.070/int 
  int name=102.190/int 
  int name=102.20/int 
  int name=102.270/int 
  int name=102.330/int 
  int name=102.40/int 
  int name=102.530/int 
  int name=102.60/int 
  int name=102.670/int 
  int name=102.80/int 
  int name=102.870/int 
  int name=102.930/int 
  int name=1022.40/int 
  int name=1030/int 

It is only returning results, which are having values started with 2, 3, 4
or some other integer instead of only 1. It is not returning records in
which value is 10 and 100. 

Please suggest. 

thanks, 
Amit 





Re: Issue with Facet Query

2009-03-20 Thread dabboo

Hi Shalin,

One more thing, 

facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]

This is my query and these are results, I am getting:

  <int name="100">202</int>
  <int name="10">57</int>
  <int name="10.6">14</int>
  <int name="10.2">11</int>
  <int name="10.67">10</int>
  <int name="10.8">9</int>
  <int name="10.99">9</int>
  <int name="1.33">7</int>
  <int name="1">6</int>
  <int name="10.4">5</int>
  <int name="10.34">4</int>
  <int name="1.01">2</int>
  <int name="1.2">2</int>
  <int name="1.66">2</int>
  <int name="10.63">2</int>
  <int name="10.66">2</int>
  <int name="1.4">1</int>
  <int name="1.7">1</int>
  <int name="1.8">1</int>
  <int name="10.33">1</int>
  <int name="10.75">1</int>
  <int name="10.9">1</int>
  <int name=".01">0</int>
  <int name=".2">0</int>
  <int name="100.05">0</int>
  <int name="100.07">0</int>
  <int name="100.13">0</int>
  <int name="100.2">0</int>
  <int name="100.25">0</int>
  <int name="100.33">0</int>
  <int name="100.4">0</int>
  <int name="100.45">0</int>
  <int name="100.53">0</int>
  <int name="100.6">0</int>
  <int name="100.67">0</int>
  <int name="100.73">0</int>
  <int name="100.8">0</int>
  <int name="100.87">0</int>
  <int name="100.95">0</int>
  <int name="100.96">0</int>
  <int name="101">0</int>
  <int name="101.1">0</int>
  <int name="101.13">0</int>
  <int name="101.2">0</int>
  <int name="101.27">0</int>
  <int name="101.33">0</int>
  <int name="101.4">0</int>
  <int name="101.47">0</int>
  <int name="101.6">0</int>
  <int name="101.67">0</int>
  <int name="101.73">0</int>
  <int name="101.8">0</int>
  <int name="101.87">0</int>
  <int name="102">0</int>
  <int name="102.07">0</int>
  <int name="102.19">0</int>
  <int name="102.2">0</int>
  <int name="102.27">0</int>
  <int name="102.33">0</int>
  <int name="102.4">0</int>
  <int name="102.53">0</int>
  <int name="102.6">0</int>
  <int name="102.67">0</int>
  <int name="102.8">0</int>
  <int name="102.87">0</int>
  <int name="102.93">0</int>
  <int name="1022.4">0</int>
  <int name="103">0</int>

It is only returning results whose values start with 2, 3, 4
or some other integer instead of only 1. It is not returning records in
which the value is 10 or 100.

Please suggest.

thanks,
Amit




dabboo wrote:
 
 Thanks Shalin, thanks a lot. I appreciate your help in resolving this
 issue.
 
 Thanks,
 Amit
 
 Shalin Shekhar Mangar wrote:
 
 On Fri, Mar 20, 2009 at 2:27 PM, dabboo ag...@sapient.com wrote:
 

 Shalin, thanks a lot. One quick question:

 Now, after writing the query the way you suggested, I am getting:

 <lst name="facet_counts">
   <lst name="facet_queries">
     <int name="productPrice_product_s:[0 TO 20]">23315</int>
   </lst>
   <lst name="facet_fields" />
   <lst name="facet_dates" />
 </lst>

 But it is not returning any records. Do I need to add an entry for this field
 to schema.xml, or anywhere else, to get the records?

 
 facet.query returns the number of documents matching that query after
 applying any filters (fq) that you may have specified.
 
 Can you tell us your use-case?
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22617745.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Special character indexing

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 3:19 PM, Gargate, Siddharth sgarg...@ptc.com wrote:

 Hi Shalin,
 Thanks for the suggestion. I tried the following code (not sure about
 the exact usage):

 CommonsHttpSolrServer ess = new CommonsHttpSolrServer(
     "http://localhost:8983/solr");
 ess.setRequestWriter(new BinaryRequestWriter());
 SolrInputDocument solrdoc = new SolrInputDocument();
 solrdoc.addField("id", "Kimi");
 solrdoc.addField("name", "03 Kimi Räikkönen ");
 ess.add(solrdoc);

 But I got the following exception on the server:

 WARNING: The @Deprecated SolrUpdateServlet does not accept query
 parameters: wt=javabin
  If you are using solrj, make sure to register a request handler to /update
 rather then use this servlet.
  Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" >
 to your solrconfig.xml


Yes, you need to add the following to your solrconfig.xml:

<requestHandler name="/update/javabin"
 class="solr.BinaryUpdateRequestHandler" />

-- 
Regards,
Shalin Shekhar Mangar.


Re: how can I check field which are indexed but not stored?

2009-03-20 Thread Markus Jelsma - Buyways B.V.


On Fri, 2009-03-20 at 03:41 -0700, sunnyfr wrote:

 Hi
 
 I have an issue: some data comes up even though I've applied a filter that
 should exclude it. When I check my MySQL database, the document has
 obviously been updated, so I would like to see how it looks in Solr.
 
 If I do /solr/video/select?q=id:8582006 I only see the fields which have
 been stored. Is there a way to see how the data is indexed for the other
 fields of my schema, which are indexed but not stored?


/solr/admin/luke
will show you a lot of information concerning stored and indexed fields.

Hope this is what you meant.


 
 Something like the DataImportHandler console, where with verbose activated I
 can see every field of my schema.
 
 Otherwise, what would you suggest in this case, for a document which has not
 been updated? How can I sort it out?
 
 Thanks a lot guys for your excellent help


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread aerox7

I'm using DataImportHandler to send my data to Solr! So you mean it's possible
to apply a transformer in db-config.xml with a Perl script?


Óscar Marín Miró wrote:
 
 Hi,
 
 My guess is that *although* your DB is in UTF-8, the database engine sends
 you the rows in ISO-Latin1, so before doing *anything* after receiving the
 data, you should transcode from ISO-Latin1 to UTF-8 and then send that to
 Solr. I'm no Java expert, but in Perl (MySQL DB in UTF-8) I have to do this
 with any row:
 
 $row = decode("iso-8859-1", $row);
 
 ... and before building the XML to invoke and add the document to Solr:
 
 $row = encode("utf8", $row);
 
 On Fri, Mar 20, 2009 at 10:55 AM, aerox7 amyne.berr...@me.com wrote:
 

 I added:
 "è" => "e" to mapping-ISOLatin1Accent.txt

 and added the following fieldType:

 <fieldType name="textCharNorm" class="solr.TextField"
  positionIncrementGap="100">
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory"
      mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>

 But I still have the same problem! It only works when I store the ISO string
 in the UTF-8 database (ex: store solène, not solène) :,(




 aerox7 wrote:
 
  == where are you seeing it as Solène as opposed to the
  correct way of solène?
 
  I have Solène in my MySQL database! So I don't know if this is
  correct or not. I guess that Solène is solène in UTF-8?!
 
  I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp.
  When I try with solène everything is OK, but when I try with Solène
  (like what I have in the DB) the analysis converts Ã to A and deletes ¨, so
  I get SolAne!
 
  I think that ISOLatin1AccentFilterFactory only takes strings with charset
  ISO-8859-1.
 
  So is there any solution to transform my string to ISO-8859-1 before the
  indexing process? Maybe by creating a transformer in DataImportHandler?
  (I have never coded in Java :( )
 
  Thank you all.
 
 
  Koji Sekiguchi-2 wrote:
 
  aerox7 wrote:
  Hi,
  I have a MySQL database in UTF-8. I have a row with Solène (solène). I
  want to transform this to solene, so I use Solr's
  ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?!!
 
  I guess that Solène is solène in UTF-8?! I also set Tomcat to UTF-8, so
  normally ISOLatin1AccentFilterFactory should replace the accent ...
 
  Any ideas?
 
  I use DataImportHandler.
 
 
  If a mapping rule "è" to "e" is always true in your field, you can try
  to use MappingCharFilter
  instead of ISOLatin1AccentFilter. Add the following line to
  mapping-ISOLatin1Accent.txt:
 
  "è" => "e"
 
  and add the following fieldType:
 
  <fieldType name="textCharNorm" class="solr.TextField"
   positionIncrementGap="100">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory"
       mapping="mapping-ISOLatin1Accent.txt"/>
      <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>
 
  MappingCharFilter and mapping-ISOLatin1Accent.txt are in the nightly build.
 
  Koji
 
 
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22617278.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 “I may not believe in myself, but I believe in what I'm doing.”
 
 -- Jimmy Page
 
 

-- 
View this message in context: 
http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22618085.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: FW: Special character indexing

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 4:13 PM, Gargate, Siddharth sgarg...@ptc.com wrote:

 Thanks Shalin,

 Adding BinaryUpdateRequestHandler solved the issue. Thank you very much.

 Just one query, shouldn't XmlUpdateRequestHandler also work for these
 characters? I saw another user mentioning the same issue and it was working
 with DirectXmlRequest.


It should. I'll run a few tests to see where the problem is.
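
For a client that builds the XML update message itself instead of using the javabin format, one thing worth checking is that the payload really is UTF-8 on the wire. The following is only a minimal Python sketch of that idea, not what SolrJ does internally: the field names `id` and `name` come from the snippet quoted in this thread, the helper name `build_add_xml` is hypothetical, and actually posting the bytes to /update is left out.

```python
import xml.etree.ElementTree as ET


def build_add_xml(fields):
    """Return a Solr <add> message for one document as UTF-8 encoded bytes.

    build_add_xml is a hypothetical helper name used only for this sketch.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    # Passing encoding="utf-8" makes tostring() return bytes, with non-ASCII
    # characters written as raw UTF-8 rather than numeric character references.
    return ET.tostring(add, encoding="utf-8")


payload = build_add_xml({"id": "Kimi", "name": "03 Kimi Räikkönen"})
print(payload.decode("utf-8"))
```

If the bytes sent to the server differ from this (for example because the string was encoded as ISO-8859-1 somewhere), the special characters would be expected to arrive garbled.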

-- 
Regards,
Shalin Shekhar Mangar.


Re: Issue with Facet Query

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 4:00 PM, dabboo ag...@sapient.com wrote:


 Hi Shalin,

 One more thing,


 facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]

 This is my query and these are results, I am getting:

  <int name="100">202</int>
  <int name="10">57</int>
  <int name="10.6">14</int>
  <int name="10.2">11</int>
  <int name="10.67">10</int>
  <int name="10.8">9</int>
  <int name="10.99">9</int>
  <int name="1.33">7</int>
  <int name="1">6</int>
  <int name="10.4">5</int>
  <int name="10.34">4</int>
  <int name="1.01">2</int>
  <int name="1.2">2</int>
  <int name="1.66">2</int>
  <int name="10.63">2</int>
  <int name="10.66">2</int>
  <int name="1.4">1</int>
  <int name="1.7">1</int>
  <int name="1.8">1</int>
  <int name="10.33">1</int>
  <int name="10.75">1</int>
  <int name="10.9">1</int>
  <int name=".01">0</int>
  <int name=".2">0</int>
  <int name="100.05">0</int>
  <int name="100.07">0</int>
  <int name="100.13">0</int>
  <int name="100.2">0</int>
  <int name="100.25">0</int>
  <int name="100.33">0</int>
  <int name="100.4">0</int>
  <int name="100.45">0</int>
  <int name="100.53">0</int>
  <int name="100.6">0</int>
  <int name="100.67">0</int>
  <int name="100.73">0</int>
  <int name="100.8">0</int>
  <int name="100.87">0</int>
  <int name="100.95">0</int>
  <int name="100.96">0</int>
  <int name="101">0</int>
  <int name="101.1">0</int>
  <int name="101.13">0</int>
  <int name="101.2">0</int>
  <int name="101.27">0</int>
  <int name="101.33">0</int>
  <int name="101.4">0</int>
  <int name="101.47">0</int>
  <int name="101.6">0</int>
  <int name="101.67">0</int>
  <int name="101.73">0</int>
  <int name="101.8">0</int>
  <int name="101.87">0</int>
  <int name="102">0</int>
  <int name="102.07">0</int>
  <int name="102.19">0</int>
  <int name="102.2">0</int>
  <int name="102.27">0</int>
  <int name="102.33">0</int>
  <int name="102.4">0</int>
  <int name="102.53">0</int>
  <int name="102.6">0</int>
  <int name="102.67">0</int>
  <int name="102.8">0</int>
  <int name="102.87">0</int>
  <int name="102.93">0</int>
  <int name="1022.4">0</int>
  <int name="103">0</int>

 It is only returning results whose values start with 2, 3, 4
 or some other integer instead of only 1. It is not returning records in
 which the value is 10 or 100.


Please do not send duplicate mails. It will not help you get an answer
faster.

If you need to filter results to a specific range then you should use filter
queries through the fq parameter:

fq=productPrice_product_str_s:[0%20TO%20100]
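
As a rough illustration of how such a request could be assembled without hand-escaping the range, here is a small Python sketch. The `q=*:*` value and the `/select` path on `localhost:8983` are assumptions made for the example; only the `facet.field` and `fq` values come from this thread.

```python
from urllib.parse import urlencode

# Parameters as (name, value) pairs; urlencode takes care of escaping the
# spaces, colon, and brackets in the range expression.
params = [
    ("q", "*:*"),  # assumed match-all query, not taken from the thread
    ("facet", "true"),
    ("facet.field", "productPrice_product_str_s"),
    ("fq", "productPrice_product_str_s:[0 TO 100]"),
]

query_string = urlencode(params)
url = "http://localhost:8983/solr/select?" + query_string
print(url)
```

Building the query string from pairs also avoids accidentally dropping the `&` separators between parameters.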

-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Óscar Marín Miró
What I mean is that unless solène travels to Solr in strict UTF-8,
mapping-ISOLatin1Accent won't do anything, and possibly your DB query returns
data in ISO-Latin1 (I always have this issue with UTF8-MySQL), so unless you
transcode your data from Latin1 to UTF8 before sending it to Solr,
mapping-ISOLatin1Accent won't know how to interpret it.

Does it make any sense? :P
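
The symptom discussed in this thread (a two-character sequence such as "Ã¨" showing up where a single "è" is expected) is what appears when UTF-8 bytes are decoded as ISO-8859-1 somewhere along the way. A minimal Python sketch of that effect, and of reversing it, under the assumption that exactly one wrong decode step happened:

```python
original = "solène"

# UTF-8 encodes "è" as two bytes; reading those bytes as ISO-8859-1
# turns them into two separate characters.
misread = original.encode("utf-8").decode("iso-8859-1")
print(misread)

# Undoing the wrong decode step recovers the intended text.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired)
```

Whether the wrong step happens in the database connection, the import layer, or the servlet container cannot be told from the string alone, so this only helps confirm the kind of damage, not where it occurs.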

On Fri, Mar 20, 2009 at 11:53 AM, aerox7 amyne.berr...@me.com wrote:


 I'm using DataImportHandler to send my data to Solr! So you mean it's
 possible to apply a transformer in db-config.xml with a Perl script?



-- 
“I may not believe in myself, but I believe in what I'm doing.”

-- Jimmy Page


Re: Issue with Facet Query

2009-03-20 Thread dabboo

I am using this query only but I am getting the same results. 


facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]


<lst name="facet_fields">
  <lst name="productPrice_product_str_s">
    <int name="100">202</int>
    <int name="10">57</int>
    <int name="10.6">14</int>
    <int name="10.2">11</int>
    <int name="10.67">10</int>
    <int name="10.8">9</int>
    <int name="10.99">9</int>
    <int name="1.33">7</int>
    <int name="1">6</int>
    <int name="10.4">5</int>
    <int name="10.34">4</int>
    <int name="1.01">2</int>
    <int name="1.2">2</int>
    <int name="1.66">2</int>
    <int name="10.63">2</int>
    <int name="10.66">2</int>
    <int name="1.4">1</int>
    <int name="1.7">1</int>
    <int name="1.8">1</int>
    <int name="10.33">1</int>
    <int name="10.75">1</int>
    <int name="10.9">1</int>
    <int name=".01">0</int>
    <int name=".2">0</int>
    <int name="0">0</int>
    <int name="100.05">0</int>
    <int name="100.07">0</int>
    <int name="100.13">0</int>
    <int name="100.2">0</int>
    <int name="100.25">0</int>
    <int name="100.33">0</int>
    <int name="100.4">0</int>
    <int name="100.45">0</int>
    <int name="100.53">0</int>
    <int name="100.6">0</int>
    <int name="100.67">0</int>
    <int name="100.73">0</int>
    <int name="100.8">0</int>
    <int name="100.87">0</int>
    <int name="100.95">0</int>
    <int name="100.96">0</int>
    <int name="101">0</int>
    <int name="101.1">0</int>
    <int name="101.13">0</int>
    <int name="101.2">0</int>
    <int name="101.27">0</int>
    <int name="101.33">0</int>
    <int name="101.4">0</int>
    <int name="101.47">0</int>
    <int name="101.6">0</int>
    <int name="101.67">0</int>
    <int name="101.73">0</int>
    <int name="101.8">0</int>
    <int name="101.87">0</int>

It is still not showing the other values. Do I need to add an entry to the
schema or solrconfig XML files? Do I need to convert the strings into numeric
values, etc.?

Please suggest.

Thanks,
Amit



-- 
View this message in context: 
http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22618714.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Issue with Facet Query

2009-03-20 Thread Shalin Shekhar Mangar
And you'll need to re-index once you make the schema change.

On Fri, Mar 20, 2009 at 5:24 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 What is the type of the productPrice_product_str field? I'm guessing that
 it is a string type.

 Since it is a float value and you need range search, you should change this
 to a 'sfloat' or 'sdouble' in your schema.xml



-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Óscar Marín Miró
Ah, got you :)

Sorry. Correct, I use a Perl client. But sorry to say, I don't use
DataImportHandler. I just make the queries to the DB, filter the results,
and build the Solr XML 'by hand' in the Perl script :(

On Fri, Mar 20, 2009 at 1:04 PM, aerox7 amyne.berr...@me.com wrote:


 Yes! I completely understand the problem. I'm just asking about your
 solution to resolve this problem.

 I guess that you use a Solr Perl client to index your database. In my case I
 use DataImportHandler, so the only solution that I have with this is to
 create a transformer for DataImportHandler and try to convert my rows from
 Latin to UTF-8 (see
 http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9
 ).

 So I just want to know if you use DataImportHandler too, with a Perl script
 as a transformer?



Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Shalin Shekhar Mangar
On Fri, Mar 20, 2009 at 5:34 PM, aerox7 amyne.berr...@me.com wrote:


 Yes! I completely understand the problem. I'm just asking about your
 solution to resolve this problem.

 I guess that you use a Solr Perl client to index your database. In my case I
 use DataImportHandler, so the only solution that I have with this is to
 create a transformer for DataImportHandler and try to convert my rows from
 Latin to UTF-8 (see
 http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9
 ).

 So I just want to know if you use DataImportHandler too, with a Perl script
 as a transformer?


No, but you can use any language which is available on the Java VM. For
example, Javascript (available by default on JDK6), JRuby, Jython, Groovy,
BeanShell etc.

But you may not need to do so much. Look at
http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/

-- 
Regards,
Shalin Shekhar Mangar.


Re: Facet Query Results Issue

2009-03-20 Thread Erik Hatcher


On Mar 20, 2009, at 6:39 AM, dabboo wrote:

this is my facet query.

facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]

This is my query and these are results, I am getting:

It is only returning results whose values start with 2, 3, 4
or some other integer instead of only 1. It is not returning records in
which the value is 10 or 100.

Please suggest.


If you want the counts filtered, use fq (instead of or in addition to  
facet.query).  facet.query/facet.field are for generating counts for  
documents that match q/fq parameters, but do not themselves filter.
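
The distinction can be pictured with a toy in-memory model. This Python sketch is only an analogy with made-up documents and a hypothetical `in_range` helper; it does not call Solr or reproduce its internals.

```python
# Three made-up documents standing in for a result set.
docs = [
    {"id": 1, "price": 5.0},
    {"id": 2, "price": 50.0},
    {"id": 3, "price": 500.0},
]


def in_range(doc, low, high):
    """Hypothetical helper: True if the document's price lies in [low, high]."""
    return low <= doc["price"] <= high


# facet.query-like behaviour: the result set is untouched, and we only
# get a count of how many of its documents match the range.
results = docs
facet_count = sum(1 for d in results if in_range(d, 0, 100))

# fq-like behaviour: the result set itself is restricted to the range.
filtered = [d for d in docs if in_range(d, 0, 100)]

print(len(results), facet_count, [d["id"] for d in filtered])
```

In the first case all three documents are still returned alongside the count; in the second, only the matching ones remain.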


Erik



Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread aerox7

My database is already in UTF-8 (collation and charset).

I already set the Tomcat connector to UTF-8, and the MySQL default charset to
UTF-8. How do I force MySQL to send UTF-8? (Or maybe I have to do this
for Tomcat?)

I'm going crazy... :)



-- 
View this message in context: 
http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22619285.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Issue with Facet Query

2009-03-20 Thread Shalin Shekhar Mangar
What is the type of the productPrice_product_str field? I'm guessing that it
is a string type.

Since it is a float value and you need range search, you should change this
to a 'sfloat' or 'sdouble' in your schema.xml
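
A rough sketch of why the field type matters: on a string field the range is evaluated in lexicographic (character by character) order, not numeric order. The Python comparison below is only a stand-in for that ordering, and the sample values are picked from the facet output quoted further down.

```python
def in_string_range(value, low, high):
    """Range test as a plain string field would see it: lexicographic order."""
    return low <= value <= high


def in_numeric_range(value, low, high):
    """Range test after converting to numbers, as a sortable float field would."""
    return float(low) <= float(value) <= float(high)


for value in ["1.33", "10.61", "100", "2", "57"]:
    print(value,
          in_string_range(value, "1", "100"),
          in_numeric_range(value, "1", "100"))
```

Values such as "2" or "57" sort after "100" as strings, so a string range of [1 TO 100] skips them even though they are numerically inside the range.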

On Fri, Mar 20, 2009 at 5:11 PM, dabboo ag...@sapient.com wrote:


 I am using this query only but I am getting the same results.



 facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]


 - lst name=facet_fields
 - lst name=productPrice_product_str_s
   int name=100202/int
  int name=1057/int
  int name=10.614/int
  int name=10.211/int
  int name=10.6710/int
  int name=10.89/int
  int name=10.999/int
  int name=1.337/int
  int name=16/int
  int name=10.45/int
  int name=10.344/int
  int name=1.012/int
  int name=1.22/int
  int name=1.662/int
  int name=10.632/int
  int name=10.662/int
  int name=1.41/int
  int name=1.71/int
  int name=1.81/int
  int name=10.331/int
  int name=10.751/int
  int name=10.91/int
  int name=.010/int
  int name=.20/int
   int name=00/int
   int name=100.050/int
  int name=100.070/int
  int name=100.130/int
  int name=100.20/int
  int name=100.250/int
  int name=100.330/int
  int name=100.40/int
  int name=100.450/int
  int name=100.530/int
  int name=100.60/int
  int name=100.670/int
  int name=100.730/int
  int name=100.80/int
  int name=100.870/int
  int name=100.950/int
  int name=100.960/int
  int name=1010/int
  int name=101.10/int
  int name=101.130/int
  int name=101.20/int
  int name=101.270/int
  int name=101.330/int
  int name=101.40/int
  int name=101.470/int
  int name=101.60/int
  int name=101.670/int
  int name=101.730/int
  int name=101.80/int
  int name=101.870/int

 It still is not showing up the other values. Do I need to make any entry in
 schema or solrConfig xml files. Do I need to convert the string into
 numeric
 values etc etc.

 Please suggest.

 Thanks,
 Amit


 Shalin Shekhar Mangar wrote:
 
  On Fri, Mar 20, 2009 at 4:00 PM, dabboo ag...@sapient.com wrote:
 
 
  Hi Shalin,
 
  One more thing,
 
 
 
 facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]
 
  This is my query and these are results, I am getting:
 
  int name=100202/int
   int name=1057/int
   int name=10.614/int
   int name=10.211/int
   int name=10.6710/int
   int name=10.89/int
   int name=10.999/int
   int name=1.337/int
   int name=16/int
   int name=10.45/int
   int name=10.344/int
   int name=1.012/int
   int name=1.22/int
   int name=1.662/int
   int name=10.632/int
   int name=10.662/int
   int name=1.41/int
   int name=1.71/int
   int name=1.81/int
   int name=10.331/int
   int name=10.751/int
   int name=10.91/int
   int name=.010/int
   int name=.20/int
   int name=100.050/int
   int name=100.070/int
   int name=100.130/int
   int name=100.20/int
   int name=100.250/int
   int name=100.330/int
   int name=100.40/int
   int name=100.450/int
   int name=100.530/int
   int name=100.60/int
   int name=100.670/int
   int name=100.730/int
   int name=100.80/int
   int name=100.870/int
   int name=100.950/int
   int name=100.960/int
   int name=1010/int
   int name=101.10/int
   int name=101.130/int
   int name=101.20/int
   int name=101.270/int
   int name=101.330/int
   int name=101.40/int
   int name=101.470/int
   int name=101.60/int
   int name=101.670/int
   int name=101.730/int
   int name=101.80/int
   int name=101.870/int
   int name=1020/int
   int name=102.070/int
   int name=102.190/int
   int name=102.20/int
   int name=102.270/int
   int name=102.330/int
   int name=102.40/int
   int name=102.530/int
   int name=102.60/int
   int name=102.670/int
   int name=102.80/int
   int name=102.870/int
   int name=102.930/int
   int name=1022.40/int
   int name=1030/int
 
  It is only returning results, which are having values started with 2, 3,
  4
  or some other integer instead of only 1. It is not returning records in
  which value is 10 and 100.
 
 
  Please do not send a duplicate mails. It will not help you get an answer
  faster.
 
  If you need to filter results to a specific range then you should use
  filter
  queries through the fq parameter:
 
  fq=productPrice_product_str_s:[0%20TO%20100]
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 

 --
 View this message in context:
 http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22618714.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread aerox7

Yes! I completely understand the problem. I'm just asking about your
solution to resolve it.

I guess that you use a Solr Perl client to index your database. In my case I
use DataImportHandler, so the only solution that I have is to
create a transformer for DataImportHandler and try to convert my rows from
Latin-1 to UTF-8 (see
http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9).

So I just want to know whether you also use DataImportHandler, with a Perl script
as a transformer?


Óscar Marín Miró wrote:
 
 What I mean is that unless solène travels to Solr in strict UTF-8,
 mapping-ISOLatin1Accent won't do anything, and posibly your DB query
 returns
 data in ISO-Latin1 (I always have this issue with UTF8-Mysql), so unless
 you
 transcode your data from Latin1 to UTF8 before sending it to SolR,
 mapping-ISOLatin1Accent won't know how to interpret it.
 
 Does it make any sense? :P
 
 On Fri, Mar 20, 2009 at 11:53 AM, aerox7 amyne.berr...@me.com wrote:
 

 I'm using DataImportHandler to send my data to Solr ! so you mean it
 possible
 to apply a transformer in db-config.xml with a perl script ?


 Óscar Marín Miró wrote:
 
  Hi,
 
  My guess is that *although* your DB is in UTF-8, the database engine
 sends
  you the rows in ISO-Latin1, so before doing *anything* after receiving
 the
  data, you should transcode from ISO-Latin1 to UTF-8 and then send that
 to
  SolR. I'm no Java expert, but in perl (MySQL DB in utf-8) I have to do
  with
  any row:
 
  $row=decode("iso-8859-1",$row);
 
  ... and before building the xml to invoque and add document to SolR:
 
  $row=encode("utf8",$row);
 
  On Fri, Mar 20, 2009 at 10:55 AM, aerox7 amyne.berr...@me.com wrote:
 
 
  I add :
  "è" => "e" to mapping-ISOLatin1Accent.txt
 
  and add the following fieldType:
 
  <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
      <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>
 
  By still have the same probleme ! it's only work when i store ISO
 string
  into UTF-8 data base (ex: store solène not solÃ¨ne) :,(
 
 
 
 
  aerox7 wrote:
  
   == where are you seeing it as SolÃ¨ne as opposed to the
   correct way of solène?
  
   I have SolÃ¨ne in my Mysql DATA BASE ! so i don't know if this is
   correct or not ? i gess that SolÃ¨ne is solène in UTF-8 ?!
  
   I'vz tryed analysis in
 http://localhost:8983/solr/admin/analysis.jsp,
  so
   when i try with solène everything is ok ! but when i try with
 SolÃ¨ne
   (like what i have in DB) analysis convert Ã in A delete ¨ so i get
  SolAne
   !!!
  
   I think that ISOLatin1AccentFilterFactory take only string with
 Charset
   ISO-8859-1 .
  
   So any solution to transform my string to ISO-8859-1 before indexing
   process. May be by creating transformer in DataImportHandler ?
 (Never
  code
   in java :( )
  
   Thank you all.
  
  
   Koji Sekiguchi-2 wrote:
  
   aerox7 wrote:
   Hi,
   I have a mysql data base in UTF-8. I have a row with SolÃ¨ne
  (solène).
   I
   want to transforme this to solene, so i use Solr
   ISOLatin1AccentFilterFactory to perform this task but it dosn't
 work
  ?!!
  
   i gess that SolÃ¨ne is solène in UTF-8 ?! i also set tomcat to
  utf-8
   so
   normaly ISOLatin1AccentFilterFactory have to replace the accent
  ...
  
   any ideas ?
  
   i use DataImportHandler.
  
  
   If a mapping rule è to e is always true in your field, you can
  try
   to use MappingCharFilter
   instead of ISOLatin1AccentFilter. Add the following line to
   mapping-ISOLatin1Accent.txt:
  
   "è" => "e"
  
   and add the following fieldType:
  
   <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
     <analyzer>
       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
       <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
     </analyzer>
   </fieldType>
  
   MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly
  build.
  
   Koji
  
  
  
  
  
  
 
  --
  View this message in context:
 
 http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22617278.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 
  --
  “I may not believe in myself, but I believe in what I'm doing.”
 
  -- Jimmy Page
 
 

 --
 View this message in context:
 http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22618085.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 “I may not believe in myself, but I believe in what I'm doing.”
 
 -- Jimmy Page
 
 

-- 
View this message in context: 
http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22618999.html
Sent from the Solr - User mailing list archive at Nabble.com.



Error in identifying the primary key

2009-03-20 Thread radha c
Hi,

I am new to Solr. I am trying to index SQL table rows.
I am getting the below error. Can anyone help me in resolving this issue.

Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImporter
verifyWithSchema
INFO: id is a required field in SolrSchema . But not found in DataConfig
Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImportHandler
inform
SEVERE: Exception while loading DataImporter
org.apache.solr.handler.dataimport.DataImportHandlerException: There are
errors in the Schema
The field :age present in DataConfig does not have a counterpart in Solr
Schema
The field :firstname present in DataConfig does not have a counterpart in
Solr Schema
The field :lastName present in DataConfig does not have a counterpart in
Solr Schema

at
org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:108)
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
at org.apache.solr.core.SolrCore.init(SolrCore.java:571)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at
org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
at
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
at
org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
at
org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831)
at
org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720)
at
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
at
org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150)
at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
at
org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
at
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
at
org.apache.catalina.core.StandardService.start(StandardService.java:448)
at
org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
Mar 20, 2009 6:03:38 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
org.apache.solr.common.SolrException: FATAL: Could not create importer.
DataImporter config invalid
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:103)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
at org.apache.solr.core.SolrCore.init(SolrCore.java:571)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)

Thanks


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Óscar Marín Miró
Hi,
Maybe this info is handy for you:

http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

The fact is Mysql can have UTF8 in its storage engine (or defined by
database), as you have, but the *connection* to the mysql client, can be set
to latin1.
In fact, here are my character_set variables:

character_set_client = latin1
character_set_connection = latin1
character_set_database = utf8
character_set_filesystem = binary
character_set_results = latin1
character_set_server = latin1
character_set_system = utf8
character_sets_dir = /usr/share/mysql/charsets/

As you can see, the database is in utf8, *but* the client, connection, results
and server expect latin1. You can see these variables in a mysql
console by typing:

$ mysql -u user -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8114
Server version: 5.0.32-Debian_7etch5-log Debian etch distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

and change them like this:

mysql> SET character_set_client = utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

So... maybe setting all the variables that are currently latin1 to utf8 will
solve your problem? If they are set to latin1, of course ;)
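
A tiny sketch of that last step, assuming the variables have been copied into a dictionary by hand (the helper name is made up for illustration). It only generates SET statements of the same form as the one shown above for every character_set_* variable that is neither utf8 nor binary; whether changing each of them is appropriate for a given setup is a separate question.

```python
def set_statements(variables, target="utf8"):
    """Return 'SET name = utf8;' lines for character_set_* variables not yet on target.

    Hypothetical helper: it only formats statements, it does not talk to MySQL.
    """
    statements = []
    for name in sorted(variables):
        value = variables[name]
        if name.startswith("character_set_") and value not in (target, "binary"):
            statements.append("SET %s = %s;" % (name, target))
    return statements


# Values taken from the first listing in this message.
reported = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/share/mysql/charsets/",
}

for line in set_statements(reported):
    print(line)
```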

If this is not the problem, hell, we escaped from work just for a few
minutes :P

On Fri, Mar 20, 2009 at 1:25 PM, aerox7 amyne.berr...@me.com wrote:


 My DATABASE is already in UTF-8 (Collation and Charset).

 I already set Tomcat connector to UTF-8, and Mysql default charset to
 UTF-8 How to force mysql to send on UTF-8 (Or may be i have to do this
 for TomCat ?)

 i'm going crazy... :)


 Shalin Shekhar Mangar wrote:
 
  On Fri, Mar 20, 2009 at 5:34 PM, aerox7 amyne.berr...@me.com wrote:
 
 
  Yes ! i completely understand the problem. I'm just asking about your
  solution to resolvre this problem.
 
  I gess that you use Solar PERL Client to index your DATABASE. for my
 case
  i
  use DataImportHandler, so to only solution that i have with this is to
  create a transformer for DataImportHandler and try to convert my row
 from
  latin to UTF-8. (see
 
 
 http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9
  )
 
  So i just wanna know if you use DataImportHandler two with a perl script
  like a transformer ?
 
 
  No, but you can use any language which is available on the Java VM. For
  example, Javascript (available by default on JDK6), JRuby, Jython,
 Groovy,
  BeanShell etc.
 
  But you may not need to do so much. Look at
 
 http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 

 --
 View this message in context:
 http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22619285.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
“I may not believe in myself, but I believe in what I'm doing.”

-- Jimmy Page


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Grant Ingersoll
Usually, when I see characters like this, it means you aren't viewing/ 
handling the UTF-8 correctly when bringing it into Java.  I would  
first check that your DB or JDBC driver is getting the chars out  
right.  It may even be the case that they did not go into the DB  
correctly in the first place.
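
For what it is worth, the symptom can be reproduced outside Solr. The sketch below (plain Python, not the DIH or JDBC code path) shows how UTF-8 bytes read as ISO-8859-1 turn one accented letter into two characters, and how the same mistake can be reversed as long as no bytes were lost.

```python
def misread_as_latin1(text):
    """Encode as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled):
    """Undo the wrong decoding: back to bytes as Latin-1, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = misread_as_latin1("sol\u00e8ne")
print(garbled)           # the two-character form of the accented letter
print(repair(garbled))   # the original word again
```

If the round trip restores the word, the data itself is fine and some layer is simply decoding with the wrong charset.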


On Mar 20, 2009, at 4:36 AM, aerox7 wrote:



== where are you seeing it as SolÃ¨ne as opposed to the
correct way of solène?

I have SolÃ¨ne in my Mysql DATA BASE ! so i don't know if this is correct
or not ? i gess that SolÃ¨ne is solène in UTF-8 ?!

I'vz tryed analysis in http://localhost:8983/solr/admin/analysis.jsp, so
when i try with solène everything is ok ! but when i try with SolÃ¨ne (like
what i have in DB) analysis convert Ã in A delete ¨ so i get SolAne !!!

I think that ISOLatin1AccentFilterFactory take only string with Charset
ISO-8859-1 .

So any solution to transform my string to ISO-8859-1 before indexing
process. May be by creating transformer in DataImportHandler ? (Never code
in java :( )

Thank you all.


Koji Sekiguchi-2 wrote:


aerox7 wrote:

Hi,
I have a mysql data base in UTF-8. I have a row with SolÃ¨ne (solène). I
want to transforme this to solene, so i use Solr
ISOLatin1AccentFilterFactory to perform this task but it dosn't work ?!!

i gess that SolÃ¨ne is solène in UTF-8 ?! i also set tomcat to utf-8 so
normaly ISOLatin1AccentFilterFactory have to replace the accent ...

any ideas ?

i use DataImportHandler.



If a mapping rule è to e is always true in your field, you can try
to use MappingCharFilter
instead of ISOLatin1AccentFilter. Add the following line to
mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.


Koji






--
View this message in context: 
http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22616220.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: solrj : probleme with utf-8 content

2009-03-20 Thread Pascal Dimassimo

Hi,

I have that problem too. But I noticed that it only happens if I send my data
via solrj. If I send it via the solr-ruby gem, everything is fine
(http://wiki.apache.org/solr/solr-ruby).

Here is my jruby script:
---
require 'rubygems'

require 'solr'
require 'rexml/document'

include Java

def send_via_solrj(text, url)
  doc = org.apache.solr.common.SolrInputDocument.new
  doc.addField('id', '1')
  doc.addField('text', text)

  server = org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.new(url)
  server.add(doc);
  server.commit();
end

def send_via_gem(text, url)
  solr_doc = Solr::Document.new
  solr_doc['id'] = '2'
  solr_doc['text'] = text

  options = {
    :autocommit => :on
  }

  conn = Solr::Connection.new(url, options)
  conn.add(solr_doc)
end

host = 'localhost'
port = ''
path = '/solr/core0'
url = "http://#{host}:#{port}#{path}"

text = "eaiou with circumflexes: êâîôû"

send_via_solrj(text, url)
send_via_gem(text, url)

puts "done!"
---

If I watch the http messages with tcpmon, I see that the data sent via solrj
is encoded in cp1252 while the data sent via the gem is utf-8.

Does anyone have an idea of how we can configure solrj to send in UTF-8?
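
To illustrate what tcpmon is showing, here is a small standalone sketch (plain Python, nothing to do with solrj internals) that encodes the same test string both ways and then reads the cp1252 bytes as if they were UTF-8, which is presumably what the receiving side ends up doing.

```python
text = "eaiou with circumflexes: \u00ea\u00e2\u00ee\u00f4\u00fb"

as_utf8 = text.encode("utf-8")      # two bytes per accented letter
as_cp1252 = text.encode("cp1252")   # one byte per accented letter


def read_as_utf8(raw):
    """Decode bytes as UTF-8, replacing undecodable bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


print(len(as_utf8), len(as_cp1252))
print(read_as_utf8(as_utf8) == text)     # round trip is clean
print(read_as_utf8(as_cp1252) == text)   # accents are lost
```

The cp1252 bytes are not valid UTF-8 sequences, so the accented letters come out as replacement characters, which matches the question marks reported in the index.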

Thanks in advance.


Walid ABDELKABIR wrote:
 
 when executing this code I got in my index the field includes with this
 value : ?  ? ? :
 ---
 String content = "eaiou with circumflexes: êâîôû";
 SolrInputDocument doc = new SolrInputDocument();
 doc.addField( "id", 123, 1.0f );
 doc.addField( "includes", content, 1.0f );
 server.add( doc );
 ---
 
 but this code works fine :
 
 ---
 String addContent = "<add><doc boost=\"1.0\">"
   + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
   + "</doc></add>";
 DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
 server.request( up );
 ---
 
 thanks for help
 
 

-- 
View this message in context: 
http://www.nabble.com/solrj-%3A-probleme-with-utf-8-content-tp22577377p22620317.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: delta-import commit=false doesn't seems to work

2009-03-20 Thread sunnyfr

Thanks I gave more information there :
http://www.nabble.com/Problem-for-replication-%3A-segment-optimized-automaticly-td22601442.html

thanks a lot Paul


Noble Paul നോബിള്‍  नोब्ळ् wrote:
 
 sorry, the whole thing was commented . I did not notice that. I'll
 look into that
 
 2009/3/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 you have set autoCommit every x minutes . it must have invoked commit
 automatically


 On Thu, Mar 19, 2009 at 4:17 PM, sunnyfr johanna...@gmail.com wrote:

 Hi,

 Even if I hit command=delta-import&commit=false&optimize=false
 I still have commit set in my logs and sometimes even optimize=true,

 About optimize I wonder if it comes from commitment too close and one is
 not
 done, but still I don't know really.

 Any idea?

 Thanks a lot,
 --
 View this message in context:
 http://www.nabble.com/delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22597630p22597630.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 --Noble Paul

 
 
 
 -- 
 --Noble Paul
 
 

-- 
View this message in context: 
http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22620439.html
Sent from the Solr - User mailing list archive at Nabble.com.



Unknown FieldType: 'string' used in QueryElevationComponent

2009-03-20 Thread radha c
Hi,

I have the schema.xml below; I did not define any string field. But I am
getting the error below when I start Tomcat.
Can anyone please suggest what the issue is here?

WARNING: No queryConverter defined, using default converter
Mar 20, 2009 7:31:55 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to searc...@fe135d main
Mar 20, 2009 7:31:55 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
org.apache.solr.common.SolrException: Unknown FieldType: 'string' used in
QueryElevationComponent
at
org.apache.solr.handler.component.QueryElevationComponent.inform(QueryElevationComponent.java:151)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
at org.apache.solr.core.SolrCore.init(SolrCore.java:571)


<schema name="example">
 <types>
   <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
   <fieldType name="integer" class="solr.IntField" omitNorms="true"/>
 </types>
 <fields>
   <!-- BOOKS -->
   <field name="person_id" type="integer" indexed="true" stored="true" multivalued="false" required="true"/>
   <field name="first_name" type="text" indexed="true" stored="true" multivalued="false"/>
   <field name="last_name" type="text" indexed="true" stored="true" multivalued="false"/>
   <field name="_age" type="integer" indexed="true" stored="true" multivalued="false"/>
   <field name="all" type="text" indexed="true" stored="true" multivalued="true"/>
 </fields>
 <uniqueKey>person_id</uniqueKey>
 <defaultSearchField>all</defaultSearchField>
 <solrQueryParser defaultOperator="OR"/>
 <copyField source="first_name" dest="all"/>
 <copyField source="last_name" dest="all"/>
 <copyField source="_age" dest="all"/>
</schema>


Re: how can I check field which are indexed but not stored?

2009-03-20 Thread sunnyfr

Cool, I was just having a look at it, but it doesn't seem to show fields
which are not stored.
I just tried:
/admin/luke?id=8582006&fl=description

but it doesn't seem to work :( It finds this id but shows the stored fields.

Did I make a mistake?
thanks a lot


Markus Jelsma - Buyways B.V. wrote:
 
 
 
 On Fri, 2009-03-20 at 03:41 -0700, sunnyfr wrote:
 
 Hi
 
 I've an issue, I've some data which come up but I've applied a filtre on
 it
 and it shouldnt, when I check in my database mysql I've obviously the
 document which has been updated so I will like to see how it is in solr.
 
 if I do : /solr/video/select?q=id:8582006 I will just see field which has
 been stored. Is there a way to see how data are indexed for other field
 of
 my schema which are not stored but indexed.
 
 
 /solr/admin/luke
 will show you a lot of information concering stored and indexed fields.
 
 Hope this is what you meant.
 
 
 
 Like a bit in the console dataimporthandler, which with verbose activated
 I
 can see every field of my schema.
 
 Otherwise what would you reckon in this case, a document which has not
 been
 updated ? how can I sort it out?
 
 Thanks a lot guys for your excellent help
 
 

-- 
View this message in context: 
http://www.nabble.com/how-can-I-check-field-which-are-indexed-but-not-stored--tp22617914p22621773.html
Sent from the Solr - User mailing list archive at Nabble.com.



q.alt and highlights

2009-03-20 Thread Marc Sturlese

Is there any way to activate highlights using q.alt of dismax?
I have hl configured correctly and working for a normal q in the field content
(in the solr.xml). For q.alt, I tried:
http://localhost:8080/solr/select/?q=&q.alt=my_id:475836&start=0&rows=10&hl=true
But no highlight is shown...
Any advice?
-- 
View this message in context: 
http://www.nabble.com/q.alt-and-highlights-tp22621774p22621774.html
Sent from the Solr - User mailing list archive at Nabble.com.



JVM exception_access_violation

2009-03-20 Thread wojtekpia

I'm running Solr on Tomcat 6.0.18 with Java 6 update 7 on Windows 2003 64
bit. Over the past month or so, my JVM has crashed twice with the error
below. Has anyone experienced this? My system is not heavily loaded, and the
crash seems to coincide with an update (via DIH). I'm running trunk code
from late January. Note that I update my index ~50 times per day, and this
crash has happened twice in the past month (so 2 of 1500 updates seem to
have triggered the crash).

This Windows deployment is for demos, so I'm not too concerned about it.
Interestingly, my production deployment is on a 64 bit Linux system (same
versions of everything) and I haven't been able to reproduce the bug there.

#
# An unexpected error has been detected by Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc005) at pc=0x080e51c3,
pid=4404, tid=956
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0-b23 mixed mode
windows-amd64)
# Problematic frame:
# V  [jvm.dll+0xe51c3]
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

---  T H R E A D  ---

Current thread (0x01de2000):  GCTaskThread [stack:
0x,0x] [id=956]

siginfo: ExceptionCode=0xc005, reading address 0x

Registers:
EAX=0x3000, EBX=0x01e40330, ECX=0x000184b49821,
EDX=0x000184b4b580
ESP=0x07cff9b0, EBP=0x, ESI=0x000184b4b580,
EDI=0x0935
EIP=0x080e51c3, EFLAGS=0x00010206

Top of Stack: (sp=0x07cff9b0)
0x07cff9b0:   01e40330 
0x07cff9c0:   000184b4dd88 0935
0x07cff9d0:   08464b08 01dbbdc0
0x07cff9e0:   01dbf190 8a65
0x07cff9f0:   2f5b4000 0002015f
0x07cffa00:   0002 01dbf2f0
0x07cffa10:   01e40330 01dbf430
0x07cffa20:   01dbf4f0 000201602d18
0x07cffa30:   07effa00 07cffb40
0x07cffa40:    
0x07cffa50:    0830484d
0x07cffa60:   0002015f 0002
0x07cffa70:   0048 0001
0x07cffa80:   0001 00bb8501
0x07cffa90:   01dbf378 080ea807
0x07cffaa0:   07cffb40 07cffb40 

Instructions: (pc=0x080e51c3)
0x080e51b3:   4c 8d 44 24 20 48 8b d6 48 8b 41 10 48 83 c1 10
0x080e51c3:   ff 90 c0 01 00 00 44 8b 1d 08 f2 44 00 45 85 db 


Stack: [0x,0x],  sp=0x07cff9b0, 
free space=127998k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
code)
V  [jvm.dll+0xe51c3]

[error occurred during error reporting (printing native stack), id
0xc005]


---  P R O C E S S  ---

Java Threads: ( = current thread )
  0x10286c00 JavaThread Thread-135 daemon [_thread_blocked,
id=4892, stack(0x1169,0x1179)]
  0x10285400 JavaThread http-8084-10 daemon [_thread_blocked,
id=5108, stack(0x1201,0x1211)]
  0x10287400 JavaThread http-8084-9 daemon [_thread_blocked,
id=1772, stack(0x149a,0x14aa)]
  0x1028a400 JavaThread http-8084-8 daemon [_thread_blocked,
id=1656, stack(0x11f1,0x1201)]
  0x01dc2c00 JavaThread http-8084-7 daemon [_thread_blocked,
id=2056, stack(0x11e1,0x11f1)]
  0x10288400 JavaThread http-8084-6 daemon [_thread_blocked,
id=4792, stack(0x11d1,0x11e1)]
  0x10286800 JavaThread MultiThreadedHttpConnectionManager cleanup
daemon [_thread_blocked, id=3792,
stack(0x1251,0x1261)]
  0x0f6e8400 JavaThread http-8084-5 daemon [_thread_blocked,
id=3540, stack(0x11c1,0x11d1)]
  0x0f6e7800 JavaThread http-8084-4 daemon [_thread_blocked,
id=4048, stack(0x11b1,0x11c1)]
  0x0f6e8000 JavaThread http-8084-3 daemon [_thread_blocked,
id=1932, stack(0x1159,0x1169)]
  0x0f6e7000 JavaThread http-8084-2 daemon [_thread_blocked,
id=996, stack(0x1149,0x1159)]
  0x01dc6000 JavaThread http-8084-1 daemon [_thread_blocked,
id=4924, stack(0x1139,0x1149)]
  0x01dc5800 JavaThread TP-Monitor daemon [_thread_blocked,
id=2288, stack(0x1121,0x1131)]
  0x01dc5400 JavaThread TP-Processor4 daemon [_thread_in_native,
id=4588, stack(0x,0x1121)]
  0x01dc4c00 JavaThread TP-Processor3 daemon [_thread_blocked,
id=652, stack(0x1101,0x)]
  0x01dc4400 JavaThread TP-Processor2 

Re: stop word search

2009-03-20 Thread revas
Hi Erik,

I have now commented out the query-time stopword analyzer. I restarted the
server. But now when I search for a stop word, I am getting results.

We had earlier indexed the content with the stop word analyzer. I don't think
we need to reindex after commenting out the query analyzer, right?

This field is a text field with the default analyzer.

Please let me know if I have missed something here.

Regards
Sujatha


On 3/17/09, Erick Erickson erickerick...@gmail.com wrote:

 Well, by definition, using an analyzer that removes stopwords
 *should* do this at query time. This assumes that you used
 an analyzer that removed stopwords at index and query time.
 The stopwords are not in the index.

 You can get the behavior you expect by using an analyzer at
 query time that does NOT remove stopwords, and one at
 indexing time that *does* remove stopwords. But I'm having a
 hard time imagining that this would result in a good user experience.

 I mean anytime that you had a stopword in the query where the
 stopword was required, no results would be returned. Which would
 be hard to explain to a user.

 What is it you're trying to accomplish?

 Best
 Erick



 On Tue, Mar 17, 2009 at 7:40 AM, revas revas...@gmail.com wrote:

  Hi,
 
  I have a query like this:

  content:the AND user_id:5

  which means: return all docs of user id 5 that have the word "the" in
  content. Since 'the' is a stop word, this query executes as just user_id:5
  in spite of the AND clause, whereas the expected result here is that, since
  there is no result for "the", no results should be returned.

  Am I missing anything here?
 
  Regards
 



Re: Error in identifying the primary key

2009-03-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
for all the fields mentioned in data-config.xml there should be a
counterpart in schema.xml

anyway that is relaxed in the latest nightly



On Fri, Mar 20, 2009 at 6:26 PM, radha c radhas...@gmail.com wrote:
 Hi,

 I am new to Solr. I am trying to index SQL table rows.
 I am getting the below error. Can anyone help me in resolving this issue.

 Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImporter
 verifyWithSchema
 INFO: id is a required field in SolrSchema . But not found in DataConfig
 Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImportHandler
 inform
 SEVERE: Exception while loading DataImporter
 org.apache.solr.handler.dataimport.DataImportHandlerException: There are
 errors in the Schema
 The field :age present in DataConfig does not have a counterpart in Solr
 Schema
 The field :firstname present in DataConfig does not have a counterpart in
 Solr Schema
 The field :lastName present in DataConfig does not have a counterpart in
 Solr Schema

        at
 org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:108)
        at
 org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95)
        at
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:571)
        at
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121)
        at
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at
 org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
        at
 org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
        at
 org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78)
        at
 org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
        at
 org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
        at
 org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
        at
 org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
        at
 org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
        at
 org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831)
        at
 org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720)
        at
 org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
        at
 org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150)
        at
 org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
        at
 org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
        at
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
        at
 org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
        at
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
        at
 org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
        at
 org.apache.catalina.core.StandardService.start(StandardService.java:448)
        at
 org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
 Mar 20, 2009 6:03:38 PM org.apache.solr.servlet.SolrDispatchFilter init
 SEVERE: Could not start SOLR. Check solr/home property
 org.apache.solr.common.SolrException: FATAL: Could not create importer.
 DataImporter config invalid
        at
 org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:103)
        at
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:571)
        at
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121)
        at
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)

 Thanks




-- 
--Noble Paul


Re: delta-import commit=false doesn't seems to work

2009-03-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
just hit the DIH without any command and you may be able to see the
status of the last import. It can tell you whether a commit/optimize
was performed

On Fri, Mar 20, 2009 at 7:07 PM, sunnyfr johanna...@gmail.com wrote:

 Thanks I gave more information there :
 http://www.nabble.com/Problem-for-replication-%3A-segment-optimized-automaticly-td22601442.html

 thanks a lot Paul


 Noble Paul നോബിള്‍  नोब्ळ् wrote:

 sorry, the whole thing was commented . I did not notice that. I'll
 look into that

 2009/3/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 you have set autoCommit every x minutes . it must have invoked commit
 automatically


 On Thu, Mar 19, 2009 at 4:17 PM, sunnyfr johanna...@gmail.com wrote:

 Hi,

 Even if I hit command=delta-import&commit=false&optimize=false
 I still see commit set in my logs, and sometimes even optimize=true.

 As for the optimize, I wonder whether it comes from commits being too close
 together, so that one is not finished, but I really don't know.

 Any idea?

 Thanks a lot,
 --
 View this message in context:
 http://www.nabble.com/delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22597630p22597630.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 --Noble Paul




 --
 --Noble Paul



 --
 View this message in context: 
 http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22620439.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
--Noble Paul


DIH data-config loading

2009-03-20 Thread Rui Pereira
I'm trying to load or delete entities in data-config at runtime by changing
the data-config.xml file, reloading, and then running delete or full-import
as needed. My question is: does data-config get loaded into memory only at
runtime and on reload, that is, can I change the file while Solr is importing
or deleting data?
Another question: to delete documents, a different handler from import is
used (update). Is it problematic to delete documents from a particular
entity while importing?

Thanks in advance,
   Rui Pereira


Re: delta-import commit=false doesn't seems to work

2009-03-20 Thread sunnyfr

As you can see, I did that and I get no information from DIH, but you can
see in my logs, and even in my segments, that an optimize is fired on its
own, automatically?


Noble Paul നോബിള്‍  नोब्ळ् wrote:
 
 just hit the DIH without any command and you may be able to see the
 status of the last import. It can tell you whether a commit/optimize
 was performed
 
 On Fri, Mar 20, 2009 at 7:07 PM, sunnyfr johanna...@gmail.com wrote:

 Thanks I gave more information there :
 http://www.nabble.com/Problem-for-replication-%3A-segment-optimized-automaticly-td22601442.html

 thanks a lot Paul


 Noble Paul നോബിള്‍  नोब्ळ् wrote:

 sorry, the whole thing was commented . I did not notice that. I'll
 look into that

 2009/3/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@gmail.com:
 you have set autoCommit every x minutes . it must have invoked commit
 automatically


 On Thu, Mar 19, 2009 at 4:17 PM, sunnyfr johanna...@gmail.com wrote:

 Hi,

 Even if I hit command=delta-import&commit=false&optimize=false
 I still see commit set in my logs, and sometimes even optimize=true.

 As for the optimize, I wonder whether it comes from commits being too close
 together, so that one is not finished, but I really don't know.

 Any idea?

 Thanks a lot,
 --
 View this message in context:
 http://www.nabble.com/delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22597630p22597630.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 --Noble Paul




 --
 --Noble Paul



 --
 View this message in context:
 http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22620439.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 
 -- 
 --Noble Paul
 
 

-- 
View this message in context: 
http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22625149.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: solrj : probleme with utf-8 content

2009-03-20 Thread Ryan McKinley

do you know if your java file is encoded with utf-8?

sometimes it will be encoded as something different and that can cause  
funny problems..
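
One way to rule the source file encoding out, as a minimal sketch rather than code from this thread: write the accented characters as Unicode escapes so the literal no longer depends on how the .java file was saved or compiled. The class and method names below are made up for illustration. The second call shows one way a string can end up as question marks (encoding into a charset that cannot represent the characters); it is not a claim about what actually happened in this case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // e, a, i, o, u with circumflexes, written as Unicode escapes so the
    // literal is independent of the encoding of this source file.
    public static final String CIRCUMFLEXES = "\u00ea\u00e2\u00ee\u00f4\u00fb";

    // Encode the text to bytes and decode it again with the same charset.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(CIRCUMFLEXES, StandardCharsets.UTF_8).equals(CIRCUMFLEXES)); // true
        // US-ASCII cannot, so each character is replaced by '?'.
        System.out.println(roundTrip(CIRCUMFLEXES, StandardCharsets.US_ASCII)); // ?????
    }
}
```

If the escaped literal indexes correctly while the original one does not, the source file encoding is the likely culprit; if both fail, the problem is further down the line.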



On Mar 18, 2009, at 7:46 AM, Walid ABDELKABIR wrote:

when executing this code, the field "includes" ends up in my index with
this value : ?  ? ? :
---
String content = "eaiou with circumflexes: êâîôû";
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", 123, 1.0f );
doc.addField( "includes", content, 1.0f );
server.add( doc );
---

but this code works fine :

---
String addContent = "<add><doc boost=\"1.0\">"
 + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
 + "</doc></add>";
DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
server.request( up );
---
---

thanks for help




Re: DIH data-config loading

2009-03-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Mar 20, 2009 at 10:57 PM, Rui Pereira ruipereira...@gmail.com wrote:
 I'm trying to load or delete entities in data-config at runtime by changing
 the data-config.xml file, reloading, and then running delete or full-import
 as needed. My question is: does data-config get loaded into memory only at
 runtime and on reload, that is, can I change the file while Solr is importing
 or deleting data?
it is safe to edit the data-config.xml . The reload happens only
if you issue the command=reload-config

 Another question: to delete documents, a different handler from import is
 used (update). Is it problematic to delete documents from a particular
 entity while importing?
Solr does not have an issue , but be aware that the commit may be
happening after the import and if that is OK for your data then it
should be OK

 Thanks in advance,
   Rui Pereira




-- 
--Noble Paul


Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
Maybe there is an issue with the recent changes in SOLR-973.
I have posted a new patch on SOLR-973.
aerox, is it possible to confirm whether that is the problem?


On Fri, Mar 20, 2009 at 6:52 PM, Grant Ingersoll gsing...@apache.org wrote:
 Usually, when I see characters like this, it means you aren't
 viewing/handling the UTF-8 correctly when bringing it into Java.  I would
 first check that your DB or JDBC driver is getting the chars out right.  It
 may even be the case that they did not go into the DB correctly in the first
 place.

 On Mar 20, 2009, at 4:36 AM, aerox7 wrote:


 == where are you seeing it as SolÃ¨ne as opposed to the
 correct way of solène?

 I have SolÃ¨ne in my MySQL database! So I don't know whether this is
 correct or not. I guess that SolÃ¨ne is solène in UTF-8?!

 I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp.
 When I try with solène everything is OK! But when I try with SolÃ¨ne (like
 what I have in the DB), analysis converts Ã to A and deletes ¨, so I get
 SolAne!

 I think that ISOLatin1AccentFilterFactory only takes strings in charset
 ISO-8859-1.

 So is there any solution to transform my string to ISO-8859-1 before the
 indexing process? Maybe by creating a transformer in DataImportHandler?
 (I have never coded in Java :( )

 Thank you all.


 Koji Sekiguchi-2 wrote:

 aerox7 wrote:

 Hi,
 I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène). I
 want to transform this to solene, so I use Solr's
 ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?!

 I guess that SolÃ¨ne is solène in UTF-8?! I also set Tomcat to UTF-8, so
 normally ISOLatin1AccentFilterFactory should replace the accent ...

 any ideas ?

 i use DataImportHandler.


 If a mapping rule è to e is always true in your field, you can try
 to use MappingCharFilter
 instead of ISOLatin1AccentFilter. Add the following line to
 mapping-ISOLatin1Accent.txt:

 "è" => "e"

 and add the following fieldType:

 <fieldType name="textCharNorm" class="solr.TextField"
 positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory"
 mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
 </fieldType>

 MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.

 Koji





 --
 View this message in context:
 http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22616220.html
 Sent from the Solr - User mailing list archive at Nabble.com.






-- 
--Noble Paul


Re: solrj : probleme with utf-8 content

2009-03-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
SOLR-973 seems to have caused the problem

On Fri, Mar 20, 2009 at 11:01 PM, Ryan McKinley ryan...@gmail.com wrote:
 do you know if your java file is encoded with utf-8?

 sometimes it will be encoded as something different and that can cause funny
 problems..


 On Mar 18, 2009, at 7:46 AM, Walid ABDELKABIR wrote:

 when executing this code, the field "includes" ends up in my index with this
 value : ?  ? ? :
 ---
 String content = "eaiou with circumflexes: êâîôû";
 SolrInputDocument doc = new SolrInputDocument();
 doc.addField( "id", 123, 1.0f );
 doc.addField( "includes", content, 1.0f );
 server.add( doc );
 ---

 but this code works fine :

 ---
 String addContent = "<add><doc boost=\"1.0\">"
  + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
  + "</doc></add>";
 DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
 server.request( up );
 ---

 thanks for help





-- 
--Noble Paul


Re: Page-Rank algorithm

2009-03-20 Thread Otis Gospodnetic

Victor,

Solr knows nothing about hyperlinks, web pages, and such.  Solr doesn't even 
have a web crawler.  You should ask on nutch-u...@lucene... mailing list 
instead.  The answer there will be positive.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Huang, Zijian(Victor) zijian.hu...@etrade.com
 To: solr-user@lucene.apache.org
 Sent: Thursday, March 19, 2009 5:55:36 PM
 Subject: Page-Rank algorithm
 
 Hi, 
Do you guys know if any version of the page-rank algorithm is
 already implemented in Solr (Lucene)? If not, how hard is it to
 implement? I am trying to improve the ranking relevance for Solr.
 
 Thanks
 
 
 Vic



Re: solrj : probleme with utf-8 content

2009-03-20 Thread Pascal Dimassimo

yes, now it works fine with the trunk sources

thanks!


Noble Paul നോബിള്‍  नोब्ळ् wrote:
 
 SOLR-973 seems to have caused the problem
 
 On Fri, Mar 20, 2009 at 11:01 PM, Ryan McKinley ryan...@gmail.com wrote:
 do you know if your java file is encoded with utf-8?

 sometimes it will be encoded as something different and that can cause
 funny
 problems..


 On Mar 18, 2009, at 7:46 AM, Walid ABDELKABIR wrote:

 when executing this code, the field "includes" ends up in my index with this
 value : ?  ? ? :
 ---
 String content = "eaiou with circumflexes: êâîôû";
 SolrInputDocument doc = new SolrInputDocument();
 doc.addField( "id", 123, 1.0f );
 doc.addField( "includes", content, 1.0f );
 server.add( doc );
 ---

 but this code works fine :

 ---
 String addContent = "<add><doc boost=\"1.0\">"
  + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
  + "</doc></add>";
 DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
 server.request( up );
 ---

 thanks for help


 
 
 
 -- 
 --Noble Paul
 
 

-- 
View this message in context: 
http://www.nabble.com/solrj-%3A-probleme-with-utf-8-content-tp22577377p22627715.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: stop word search

2009-03-20 Thread Erick Erickson
Yes, you do need to reindex after removing the stopword filter
from the configuration. When you indexed the first time using
the stopword filter, the words were NOT indexed, so they won't
be found now that they're getting through the query analyzer.
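
To make the point concrete, here is a toy sketch (plain Java, not Lucene or Solr code; the class, the tiny stopword list and the sample text are all assumptions for illustration). A document indexed with stopword removal simply has no "the" term, so a query that keeps "the" cannot match it until the document is reindexed without the filter.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StopwordDemo {
    // Hypothetical stopword list, for illustration only.
    public static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    // Lowercase, split on whitespace, and optionally drop stopwords.
    public static List<String> analyze(String text, boolean removeStopwords) {
        return Arrays.stream(text.toLowerCase().split("\\s+"))
                .filter(t -> !t.isEmpty())
                .filter(t -> !(removeStopwords && STOPWORDS.contains(t)))
                .collect(Collectors.toList());
    }

    // AND semantics: every analyzed query term must be among the indexed terms.
    public static boolean matches(Set<String> indexedTerms, String query, boolean removeStopwordsAtQueryTime) {
        return indexedTerms.containsAll(analyze(query, removeStopwordsAtQueryTime));
    }

    public static void main(String[] args) {
        Set<String> oldIndex = new HashSet<>(analyze("the quick fox", true));   // stopwords removed at index time
        Set<String> reindexed = new HashSet<>(analyze("the quick fox", false)); // stopwords kept at index time

        System.out.println(matches(oldIndex, "the", true));   // true: the clause is analyzed away entirely
        System.out.println(matches(oldIndex, "the", false));  // false: "the" was never indexed
        System.out.println(matches(reindexed, "the", false)); // true: found after reindexing
    }
}
```

The first line mirrors the original behavior (the clause on "the" disappears and only the rest of the query counts); the second is what happens when only the query-side filter is removed.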

Best
Erick

On Fri, Mar 20, 2009 at 1:02 PM, revas revas...@gmail.com wrote:

 Hi Erick,

 I have now commented out the query-time stopword analyzer and restarted the
 server. But now when I search for a stop word, I am getting results.

 We had earlier indexed the content with the stop word analyzer. I don't think
 we need to reindex after commenting out the query analyzer, right?

 This field is a text field with the default analyzer.

 Please let me know if i have missed something here.

 Regards
 Sujatha


 On 3/17/09, Erick Erickson erickerick...@gmail.com wrote:
 
  Well, by definition, using an analyzer that removes stopwords
  *should* do this at query time. This assumes that you used
  an analyzer that removed stopwords at index and query time.
  The stopwords are not in the index.
 
  You can get the behavior you expect by using an analyzer at
  query time that does NOT remove stopwords, and one at
  indexing time that *does* remove stopwords. But I'm having a
  hard time imagining that this would result in a good user experience.
 
  I mean anytime that you had a stopword in the query where the
  stopword was required, no results would be returned. Which would
  be hard to explain to a user.
 
  What is it you're trying to accomplish?
 
  Best
  Erick
 
 
 
  On Tue, Mar 17, 2009 at 7:40 AM, revas revas...@gmail.com wrote:
 
   Hi,
  
   I have a query like this:
  
   content:the AND user_id:5
  
   which means: return all docs of user id 5 that have the word "the" in
   content. Since 'the' is a stop word, this query executes as just user_id:5
   in spite of the AND clause, whereas the expected result here is that, since
   there is no result for "the", no results should be returned.
  
   Am I missing anything here?
  
   Regards
  
 



Re: Stemming in Solr

2009-03-20 Thread Chris Hostetter


: Can someone please let me know how to implement stemming in solr. I am
: particularly looking of the changes, I might need to do in the config files
: and also if I need to use some already supplied libraries/factories etc etc.

i would start by searching the wiki and email archives for stemming...

http://wiki.apache.org/solr/?action=fullsearch&context=180&value=stemming&fullsearch=Text


-Hoss



Re: Special Characters search in solr

2009-03-20 Thread Chris Hostetter

: Yes, I did and below is my debugQuery result.

before you even look at the debug section, look at the params section in 
the responseHeader...

:   <str name="q">Colo�</str> 

the raw value Solr is getting from your servlet container doesn't match 
what you think you are sending...

: It is actually converting Coloèr to Colo� and hence not searching. It is

...i'm guessing that either your servlet container is misconfigured for 
dealing with UTF-8 characters, or your client code is doing something not 
quite right ... until you get the value you expect to see coming back in 
that responseHeader, there's no point in fiddling with your schema.
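
A rough sketch of how that kind of mismatch can arise, using only the JDK's URLEncoder/URLDecoder (this is not Solr or servlet container code; the class name and the sample value are assumptions for illustration). If the client percent-encodes the parameter in one charset and the server decodes it in another, the value that arrives is not the value that was sent.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class QueryEncodingDemo {
    // Percent-encode a parameter value using the named charset.
    public static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decode a percent-encoded parameter value using the named charset.
    public static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String q = "Colo\u00e8r"; // the e with grave accent, written as an escape

        System.out.println(encode(q, "UTF-8"));      // Colo%C3%A8r
        System.out.println(encode(q, "ISO-8859-1")); // Colo%E8r

        // Both sides agree on UTF-8: the value survives.
        System.out.println(decode(encode(q, "UTF-8"), "UTF-8").equals(q)); // true

        // Client sends ISO-8859-1 bytes, server decodes as UTF-8: the lone
        // byte is not valid UTF-8, so the decoded value no longer equals q.
        System.out.println(decode(encode(q, "ISO-8859-1"), "UTF-8").equals(q)); // false
    }
}
```

Comparing the raw query string the client sends with the q value echoed in the responseHeader is usually enough to tell which side is using the wrong charset.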


-Hoss


Re: Issue with Facet Query

2009-03-20 Thread Chris Hostetter

: I am using this query only but I am getting the same results. 
: 
: 
: 
facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]
...
: It still is not showing up the other values. Do I need to make any entry in
: schema or solrConfig xml files. Do I need to convert the string into numeric
: values etc etc.
...
:  It is only returning results, which are having values started with 2, 3,
:  4
:  or some other integer instead of only 1. It is not returning records in
:  which value is 10 and 100.

your fq param is saying you only want docs matching values between 1 and 
100, you seem to be using a string type, so it's not going to match 
anything starting with a character other than a 1 ... if it doesn't 
match any docs with values like 23 then the facet counts for 23 are 
going to be 0 as well.

reading between the lines, i think you misunderstood Shalin about 10 
messages ago ... fq is for providing a *filter* query, it restricts the 
results of your entire query.  facet.query is for faceting on an arbitrary 
query (which can be a range query)

if you search for 'ipod' and you want to get back *all* the
documents that match, but you also want to know how many of those have a 
price between $10 and $100 use a facet.query.

if you search for 'ipod' and you want to get back *only* the documents
that have a price between $10 and $100 use an fq.

...but either way: yes, convert to a numeric field type so that your 
ranges will actually work properly.
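
As a small illustration of why the string type breaks the range (a plain Java sketch, not Solr code; the class and method names are made up, and String.compareTo only approximates how a range over a string field orders its values):

```java
public class RangeDemo {
    // Inclusive range check using plain string ordering.
    public static boolean inStringRange(String value, String low, String high) {
        return value.compareTo(low) >= 0 && value.compareTo(high) <= 0;
    }

    // Inclusive range check after parsing the value as a number.
    public static boolean inNumericRange(String value, int low, int high) {
        int n = Integer.parseInt(value);
        return n >= low && n <= high;
    }

    public static void main(String[] args) {
        for (String price : new String[] {"1", "10", "100", "23", "5"}) {
            System.out.println(price
                    + " string=" + inStringRange(price, "1", "100")
                    + " numeric=" + inNumericRange(price, 1, 100));
        }
        // As strings, "23" and "5" sort after "100", so they fall outside
        // [1 TO 100]; as numbers, all five values are inside the range.
    }
}
```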



-Hoss



Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory

2009-03-20 Thread aerox7

Hi,
I've checked the MySQL conf with mysql> SHOW VARIABLES LIKE 'character_set%';
: all the character_set variables are UTF-8.

I think that the data importer gets the data as ISO, so I just wrote a custom
transformer to change the row's charset from ISO to UTF, and now it works.
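
For readers hitting the same symptom, the core of such a conversion might look like the sketch below. This is an assumption about the approach, not the actual transformer from this thread, and it leaves out the DataImportHandler plumbing; the class and method names are hypothetical. It undoes the case where UTF-8 bytes were decoded as ISO-8859-1 somewhere between the database and Java.

```java
import java.nio.charset.StandardCharsets;

public class CharsetRepair {
    // Re-interpret a string whose UTF-8 bytes were wrongly decoded as
    // ISO-8859-1: recover the original bytes, then decode them as UTF-8.
    public static String latin1ToUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Sol" + U+00C3 + U+00A8 + "ne": the two-character sequence is what
        // an e with grave accent looks like after the wrong decoding step.
        String broken = "Sol\u00c3\u00a8ne";
        String fixed = latin1ToUtf8(broken);
        System.out.println(fixed.equals("Sol\u00e8ne")); // true
    }
}
```

Only apply this to values known to be mis-decoded: running it over text that is already correct will damage any non-ASCII characters in it.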

-- Noble Paul : I use the SOLR 1.4 nightly 2009-03-18 build. Do I have to
download the latest one to apply your patch?


Noble Paul നോബിള്‍  नोब्ळ् wrote:
 
 Maybe there is an issue with the recent changes in SOLR-973.
 I have posted a new patch on SOLR-973.
 aerox, is it possible to confirm whether that is the problem?
 
 
 On Fri, Mar 20, 2009 at 6:52 PM, Grant Ingersoll gsing...@apache.org
 wrote:
 Usually, when I see characters like this, it means you aren't
 viewing/handling the UTF-8 correctly when bringing it into Java.  I would
 first check that your DB or JDBC driver is getting the chars out right.
  It
 may even be the case that they did not go into the DB correctly in the
 first
 place.

 On Mar 20, 2009, at 4:36 AM, aerox7 wrote:


 == where are you seeing it as SolÃ¨ne as opposed to the
 correct way of solène?

 I have SolÃ¨ne in my MySQL database! So I don't know whether this is
 correct or not. I guess that SolÃ¨ne is solène in UTF-8?!

 I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp.
 When I try with solène everything is OK! But when I try with SolÃ¨ne (like
 what I have in the DB), analysis converts Ã to A and deletes ¨, so I get
 SolAne!

 I think that ISOLatin1AccentFilterFactory only takes strings in charset
 ISO-8859-1.

 So is there any solution to transform my string to ISO-8859-1 before the
 indexing process? Maybe by creating a transformer in DataImportHandler?
 (I have never coded in Java :( )

 Thank you all.


 Koji Sekiguchi-2 wrote:

 aerox7 wrote:

 Hi,
 I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène). I
 want to transform this to solene, so I use Solr's
 ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?!

 I guess that SolÃ¨ne is solène in UTF-8?! I also set Tomcat to UTF-8, so
 normally ISOLatin1AccentFilterFactory should replace the accent ...

 any ideas ?

 i use DataImportHandler.


 If a mapping rule è to e is always true in your field, you can try
 to use MappingCharFilter
 instead of ISOLatin1AccentFilter. Add the following line to
 mapping-ISOLatin1Accent.txt:

 "è" => "e"

 and add the following fieldType:

 <fieldType name="textCharNorm" class="solr.TextField"
 positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory"
 mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
 </fieldType>

 MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.

 Koji





 --
 View this message in context:
 http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22616220.html
 Sent from the Solr - User mailing list archive at Nabble.com.



 
 
 
 -- 
 --Noble Paul
 
 

-- 
View this message in context: 
http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22633051.html
Sent from the Solr - User mailing list archive at Nabble.com.