You are right about the parsing of query terms without double quotes
(solrQueryParser's defaultOperator has to be AND in your case). For the
problem at hand, two things -
1. Do you have any reason for not doing a PhraseQuery (query terms
enclosed in double quotes) on your edgytext field?
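For context, the defaultOperator mentioned above lives in schema.xml; a minimal sketch:

```xml
<!-- schema.xml: make unquoted multi-term queries require all terms -->
<solrQueryParser defaultOperator="AND"/>
```

Enclosing the terms in double quotes instead makes the query parser build a single PhraseQuery against the field, which is what point 1 is suggesting.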
are you referring to DIH?
yes
2009/10/27 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
2009/10/26 Licinio Fernández Maurelo licinio.fernan...@gmail.com:
Hi there,
I must enhance our Solr config deploys.
I have a configuration file per environment and per role (Master-Slave),
so I
On Oct 27, 2009, at 6:36 AM, markus.rietz...@rzf.fin-nrw.de
wrote:
hi,
we want to use SOLR as our intranet search engine.
i downloaded the nightly build of Solr 1.4. PDF extraction is done via
Solr Cell/Tika. I can send the PDF via curl
to Solr.
we do have a
thanks,
i know and have read that page. Sending additional meta-tags with the curl call is
no problem. I only thought there might be a way to use the XML approach
with PDF files as well. I'll go the curl way for those files.
--
mit freundlichen Grüßen
Markus Rietzler - rietzler_software/
You can send PDF files via SolrJ:
http://www.lucidimagination.com/blog/2009/09/14/posting-rich-documents-to-apache-solr-using-solrj-and-solr-cell-apache-tika/
I'm sure the various other clients could do the same thing. All you
really need is a way to upload the files.
Still, sending lots
Hi All,
I'm using a multifield query parser to generate weighted queries
across different fields.
For instance, perl developer gives me:
+(title:perl^10.0 keywords:perl company:perl^3.0)
+(title:developer^10.0 keywords:developer company:developer^3.0)
Either in solr 1.3 or solr 1.4 (from 12 oct
On Tue, Oct 27, 2009 at 8:44 AM, Jérôme Etévé jerome.et...@gmail.com wrote:
I don't really get why these two tokens are subsequently put together
in a phrase query.
That's the way the Lucene query parser has always worked... phrase
queries are made if multiple tokens are produced from one field
Hi,
If I use the ExtractingRequestHandler
http://wiki.apache.org/solr/ExtractingRequestHandler on a local file
(as shown in http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput),
all works well, but how do I do this for files located on a server?
e.g. (works)
curl
On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka k...@basistech.com wrote:
Are multiple CPUs utilized at indexing time as well, or just by searcher?
Yes, multiple CPUs are utilized for indexing.
If you're
You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml
). Otherwise, look into a crawler such as Nutch or Droids or Heritrix.
-Grant
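Remote streaming, once enabled in solrconfig.xml, lets Solr fetch the document itself from a URL passed as stream.url; a sketch (the upload limit value is illustrative):

```xml
<!-- solrconfig.xml: allow Solr to pull content referenced by stream.url -->
<requestDispatcher>
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048"/>
</requestDispatcher>
```

A request such as `http://localhost:8983/solr/update/extract?extractOnly=true&stream.url=http://someserver/file.pdf` (URL illustrative) then makes Solr retrieve the remote file itself.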
On Oct 27, 2009, at 11:14 AM, Insight 49, LLC wrote:
Hi,
If I use the ExtractingRequestHandler
On Tue, Oct 27, 2009 at 10:23 AM, gabriele renzi rff@gmail.com wrote:
On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka k...@basistech.com
wrote:
Are multiple CPUs utilized at indexing time as well, or just
curl reads from a local file or stdin, so you could do something like
the following if it's only a single file from a webserver:
curl http://someserver/file.html | curl
"http://localhost:8983/solr/update/extract?extractOnly=true" -F na...@-
but this way there is no crawling, no link following, etc...
--
mit
Grant Ingersoll wrote:
You might try remote streaming with Solr (see
http://wiki.apache.org/solr/SolrConfigXml). Otherwise, look into a
crawler such as Nutch or Droids or Heritrix.
Additionally, Nutch can be configured to send the crawled/parsed
documents to Solr for indexing.
--
Best
Hum,
That's probably because of our own customized types/tokenizers/filters.
I tried reindexing and querying our data using the default solr type
'textgen' and it works fine.
I need to investigate which features of the new Lucene 2.9 API are not
implemented in our own tokenizers etc...
Thanks.
Actually here is the difference between the textgen analysis pipeline and our:
For the phrase "ingenieur d'affaire senior",
our pipeline gives, right after our tokenizer:
term position: 1          2  3        4
term text:     ingenieur  d  affaire  senior
'd' and 'affaire' are
Andrzej Bialecki wrote:
Grant Ingersoll wrote:
You might try remote streaming with Solr (see
http://wiki.apache.org/solr/SolrConfigXml). Otherwise, look into a
crawler such as Nutch or Droids or Heritrix.
Additionally, Nutch can be configured to send the crawled/parsed
documents to Solr
Hello All,
I am a newbie, learning the Solr plugins concept. While
following the tutorial on the same from
http://wiki.apache.org/solr/SolrPlugins , I have tried to work on
Query Parser plugin concept by extending QParserPlugin class.
I have registered my custom plugin class in
Hi,
currently on dev. I use following command (in a browser) to create a new
core (we use Solr in a multi-core mode):
Hi There,
Is there a way to get facet.query= to ignore the fq= param? We want to
do a query like this:
select?fl=*&start=0&q=cool&fq=in_stock:true&facet=true&facet.query=in_stock:false&qt=dismax
To understand the count of items not in stock, when someone has
filtered items that are in stock. Or is
Thanks, that was just what I was looking for!
On Tue, Oct 27, 2009 at 1:27 PM, Jérôme Etévé jerome.et...@gmail.com wrote:
Hi,
you need to 'tag' your filter and then exclude it from the faceting.
An example here:
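The tag/exclude approach being referred to looks roughly like this (the tag name `st` is illustrative):

```
select?q=cool&fq={!tag=st}in_stock:true&facet=true&facet.query={!ex=st}in_stock:false
```

The `{!tag=st}` local param labels the filter query, and `{!ex=st}` on the facet tells Solr to compute that facet count as if the tagged fq were absent.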
Folks:
My db contains approx 6M records -- on average each is approx 1K bytes. When
I use the DIH, I reliably get an OOM exception. The machine has 4 GB ram,
my tomcat is set to use max heap of 2G.
The option of increasing memory is not tenable because as the number of documents
grows I
Hi,
I got the same problem using DIH with a large dataset in MySql database.
Following
http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-implementation-notes.html
and looking at the Java code, it appears that DIH uses a PreparedStatement
in the JdbcDataSource.
I set the
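The commonly documented workaround for MySQL here is to make the driver stream rows instead of buffering the entire result set, by setting batchSize to -1 on the DIH data source; a sketch (driver and URL values are illustrative):

```xml
<!-- data-config.xml: batchSize="-1" makes JdbcDataSource request
     row-by-row streaming from MySQL Connector/J -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/mydb"
            batchSize="-1"/>
```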
Use &lt; instead of < in that attribute. That should fix the issue.
Remember, it's an XML file, so it has to obey XML encoding rules which
make it ugly but whatcha gonna do?
Erik
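Concretely, the comparison has to be written with the XML entity inside the attribute; a minimal sketch (table and column names are illustrative):

```xml
<!-- data-config.xml: < must be escaped as &lt; inside an attribute -->
<entity name="nodes"
        query="SELECT * FROM nodes WHERE node_depth &lt; 4"/>
```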
On Oct 27, 2009, at 11:50 AM, Andrew Clegg wrote:
Hi,
If I have a DataImportHandler query with a
Heh, eventually I decided
where 4 > node_depth
was the most pleasing (if slightly WTF-ish) way of writing it...
Cheers,
Andrew.
Erik Hatcher-4 wrote:
Use &lt; instead of < in that attribute. That should fix the issue.
Remember, it's an XML file, so it has to obey XML encoding rules
: i found out when i use the new feature solr.ReversedWildcardFilterFactory
that the following happens:
Hmmm, in spite of ReversedWildcardFilterFactory being something designed
to be used as part of the analysis chain for a field *so* you can do
leading wildcard queries on it, it doesn't
On Oct 27, 2009, at 12:58 PM, Phanindra Reva wrote:
Hello All,
I am a newbie, learning Solr - plugins concept. While
following the tutorials on the same from
http://wiki.apache.org/solr/SolrPlugins , I have tried to work on
Query Parser plugin concept by extending QParserPlugin
: More appropriate would be to use a commandline utility like 'curl' to
: create the new cores.
:
: curl 'http://production.example.com:8080/solr/admin/cores?rest-of-request'
More specifically: HTTP is the only way to admin a Solr port while running.
All of the existing scripts that come
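A typical CoreAdmin request over HTTP looks like this (core name and instance directory are illustrative):

```
http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=/path/to/core1
```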
On Oct 27, 2009, at 9:44 AM, patric.wi...@rtl.de wrote:
Hey,
i found out when i use the new feature
solr.ReversedWildcardFilterFactory that the following happens:
I query for thomas.
<str name="rawquerystring">nickname:thomas</str>
<str name="querystring">nickname:thomas</str>
How big is your index? Can you share your solrconfig? Have you
looked at it in a profiler during this time? What is it doing?
-Grant
On Oct 26, 2009, at 8:32 PM, Teruhiko Kurosaka wrote:
I've been testing Solr 1.4.0 (RC).
After sometime, solr started to pause
for a long time (a minutes
On Oct 26, 2009, at 1:14 AM, Rakhi Khatwani wrote:
Hi,
i was trying to insert one million records in solr (keeping the
id from
0 to 100). things were fine till it inserted (id = 523932).
after that
it started inserting it from 1 (i.e. updating). I am not able to
understand
this
On Tue, Oct 27, 2009 at 4:14 PM, Grant Ingersoll gsing...@apache.org wrote:
I'm pretty sure wildcard queries don't go through analysis, hence they are
probably not stemmed.
Right - same thing would happen w/o the reverse filter.
Also, wildcarding mixes poorly with stemming - trying to analyze
: If I issue two optimize / requests with no intervening changes to the
: index, will the second optimize request be smart enough to not do
: anything?
the actual optimize is optimized, but postCommit and postOptimize event
listeners will still be fired.
-Hoss
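For reference, such event listeners are declared on the update handler in solrconfig.xml; a minimal sketch (the executable path is illustrative):

```xml
<!-- solrconfig.xml: this listener fires after every optimize,
     even one that had no work to do -->
<updateHandler class="solr.DirectUpdateHandler2">
  <listener event="postOptimize" class="solr.RunExecutableListener">
    <str name="exe">solr/bin/snapshooter</str>
    <bool name="wait">true</bool>
  </listener>
</updateHandler>
```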
Hello,
Is there a solution packaged in SOLR that deserializes XML response
documents into Lucene documents?
I've been making modifications here and there to the Solr source code in
hopes to optimize for my particular setup. My goal now is to establish a
decent benchmark toolset so that I can evaluate the observed performance
increase before deciding to roll it out. So far I've investigated JMeter and
From: Grant Ingersoll [mailto:gsing...@apache.org]
Sent: Tuesday, October 27, 2009 1:15 PM
To: solr-user@lucene.apache.org
Subject: Re: long startup time
How big is your index? Can you share your solrconfig? Have
you looked at it in a profiler during this time? What is it doing?
The
Hey guys! Don't forget this is tomorrow (Wednesday). See you there!
Cheers,
Bradford
On Sun, Oct 18, 2009 at 5:10 PM, Bradford Stephens
bradfordsteph...@gmail.com wrote:
Greetings,
(You're receiving this e-mail because you're on a DL or I think you'd
be interested)
It's time for another
Hi all,
I was trying to use the new 'shareSchema=true' feature in solr 1.4 and
it appears as though this will only happen in one configuration. I'd like
someone to confirm this for me and then we can file a bug on it.
This all happens in CoreContainer.create().
When you have shareSchema=true
Is there any way to get 1.3 Solr to use something other than java logging?
Am running solr inside tomcat and would like logging for solr to be directed
to one set of (rotated) log files and leave tomcat logging in its own log files.
Also, with 1.4, I see it requires removal of jar and swapping
Hi, Gilbert:
Thanks for your tip! I just tried it. Unfortunately, it does not work for
me. I still get the OOM exception.
How large was your dataset? And what were your machine specs?
Cheers,
- Bill
--
From: Gilbert Boyreau
Hi Thomas,
If you check out SOLR-1516, I developed a custom response writer that
simplifies this process. You basically have to implement an #emitDoc or an
#emitDocList function in which you are handed the resultant o.a.l.Document List
or o.a.l.Document object (on a per Document basis) and you
Mike,
For response times I would also look at java.net's Faban benchmarking
framework. We use it extensively for our acceptance tests and tuning
exercises.
Joshua
On Oct 27, 2009, at 1:59 PM, Mike Anderson wrote:
I've been making modifications here and there to the Solr source
code
Have two cores with some common fields in their schemas. I want to
perform a MLT query on one core and get results from the other schema.
Both cores have same type of id.
I saw this thread:
http://www.nabble.com/Does-MoreLikeThis-support-sharding--td25378654.html
This is not quite what I am
I have been playing with this using the analysis.jsp. I am still not clear
why we don't want to catenate at query time. Here is my example.
With the current text field, the query term iPhone will not match a document
containing the string iphone, because iPhone is analyzed into two terms:
i(1) and
(cross posted to many user lists, please confine reply to gene...@lucene)
There will be a Lucene meetup next week at ApacheCon in Oakland, CA on
Tuesday, November 3rd. Meetups are free (the rest of the conference is
not). See: http://wiki.apache.org/lucene-java/LuceneAtApacheConUs2009
For
This is different for each model of computer. It has to do with
exactly what chips are used.
On Fri, Oct 23, 2009 at 10:42 AM, Jérôme Etévé jerome.et...@gmail.com wrote:
2009/10/23 Andrzej Bialecki a...@getopt.org:
Jérôme Etévé wrote:
Hi all,
I'm using Solr trunk from 2009-10-12 and I
hi,
Looks like a bug. open an issue.
On Wed, Oct 28, 2009 at 4:04 AM, Jeremy Hinegardner
jer...@hinegardner.org wrote:
Hi all,
I was trying to use the new 'shareSchema=true' feature in solr 1.4 and
it appears as though this will only happen in one configuration. I'd like
someone to confirm
On Tue, Oct 27, 2009 at 10:31 PM, Bill Au bill.w...@gmail.com wrote:
Here is my example.
With the current text field, the query term iPhone will not match a document
containing the string iphone, because iPhone is analyzed into two terms:
i(1) and phone(2).
Right. The limitations are known...
I've opened an issue https://issues.apache.org/jira/browse/SOLR-1527
2009/10/28 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com:
hi,
Looks like a bug. open an issue.
On Wed, Oct 28, 2009 at 4:04 AM, Jeremy Hinegardner
jer...@hinegardner.org wrote:
Hi all,
I was trying to use the new
I just tested your patch, it works for me. I vote for getting this in on the
1.4 release :-)
enjoy,
-jeremy
On Wed, Oct 28, 2009 at 09:15:03AM +0530, Noble Paul നോബിള് नोब्ळ् wrote:
I've opened an issue https://issues.apache.org/jira/browse/SOLR-1527
2009/10/28
Thanks Avlesh. The issue with not doing a phrase query on my edgytext field
was that my parent application was adding an escape character to the quotation
marks, and I was hoping to fix (or rather, work around) this at the Solr end to save
maintenance overhead. But I've done a hack in the parent
My next issue relates to how to get results from the author field to come up
in a search across all fields. For example, a search on author:Houghton, B
(which uses the edgytext) yields 16 documents, but a search on
all:Houghton, B (which doesn't) yields only 9. I thought the solution
should
Have two cores with some common fields in their schemas. I want to perform
a MLT query on one core and get results from the other schema. Both cores
have same type of id.
Having the same type of id in two different cores is of no use to an MLT
handler (which in fact operates on one core)