Not sure exactly what you mean, can you give an example?
Upayavira
On Sat, Jan 12, 2013, at 06:32 AM, J Mohamed Zahoor wrote:
> Cool… it worked… But the count of all the groups and the count inside
> stats component does not match…
> Is that a bug?
>
> ./zahoor
>
>
> On 11-Jan-2013, at 6:48 PM
Cool… it worked… But the count of all the groups and the count inside stats
component does not match…
Is that a bug?
./zahoor
On 11-Jan-2013, at 6:48 PM, Upayavira wrote:
> could you use field collapsing? Boost by date and only show one value
> per group, and you'll have the most recent document only.
Awesome!
This one line did the trick:
http://lucene.472066.n3.nabble.com/how-to-perform-a-delta-import-when-related-table-is-updated-tp4032587p4032671.html
Sent from the Solr - User mailing list archive at Nabble.com.
It's an ExtractingRequestHandler parameter (see the wiki). Not quite sure of the
Java incantation to set that, but it's definitely possible.
Erik
On Jan 11, 2013, at 17:14, uwe72 wrote:
> Erik, what do you mean by this parameter? I don't find it.
>
>
>
Erik, what do you mean by this parameter? I don't find it.
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrJ-ContentStreamUpdateRequest-Accessing-parsed-items-without-committing-to-solr-tp4032636p4032656.html
Look at the extractOnly parameter.
But doing the extraction in your client is the recommended way, to keep
Solr from getting beaten up too badly.
Erik
On Jan 11, 2013, at 15:55, uwe72 wrote:
> I have a somewhat strange use case.
>
> When I index a PDF to Solr I use ContentStreamUpdateReq
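A sketch of what the extractOnly approach suggested above might look like in SolrJ. This is an untested illustration, not code from the thread: the Solr URL, file name, and /update/extract handler path are assumptions based on a default Solr 4.x setup.

```java
import java.io.File;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.NamedList;

public class ExtractOnlyExample {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr");

        // Send the PDF to the extracting handler, but ask Solr to only
        // run Tika extraction and return the result, not index anything.
        ContentStreamUpdateRequest req =
            new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("document.pdf"), "application/pdf");
        req.setParam("extractOnly", "true");

        // The extracted text and metadata come back in the response;
        // nothing is committed to the index.
        NamedList<Object> response = solr.request(req);
        System.out.println(response);
    }
}
```

This needs a running Solr instance with the extraction contrib on the classpath, so treat it as a starting point rather than a drop-in.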
On 1/11/2013 1:33 PM, Achim Domma wrote:
"At the base, Solr indexes are Lucene indexes, so one can always
drop down to that level."
That's what I'm looking for. I understand that, in the end, there has to be an inverted index (or rather
multiple of them), holding all "words" which occur in m
ok, seems this works:
// org.apache.tika.Tika from tika-core; parseToString extracts plain text
Tika tika = new Tika();
String tokens = tika.parseToString(file);
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrJ-ContentStreamUpdateRequest-Accessing-parsed-items-without-committing-to-solr-tp4032636p4032649.html
Yes, I don't really want to index/store the PDF document in Lucene.
I just need the parsed tokens for other things.
So you mean I can use ExtractingRequestHandler.java to retrieve the items?
Does anybody have a piece of code doing that?
Actually I give the PDF as input and want the parsed items (the
If I understand it, you are sending the file to Solr, which then uses the Tika
library to do the preprocessing/extraction and stores the results in the
defined fields.
If you don't want Solr to do the storing and want to change the extracted
fields, just use the Tika library in your client and work with r
Moreover, I just checked: autoGeneratePhraseQueries="true" is set for both
3.4 and 4.0 in my schema.
Thanks
Varun
On Fri, Jan 11, 2013 at 1:04 PM, varun srivastava wrote:
> Hi Jack,
> Is this a new change in Solr 4.0? The autoGeneratePhraseQueries
> option seems to be present since Solr 3.1. Just
Hi Jack,
Is this a new change in Solr 4.0? The autoGeneratePhraseQueries
option seems to be present since Solr 3.1. I just wanted to confirm this is the
difference causing the change in behavior between 3.4 and 4.0.
Thanks
Varun
On Mon, Dec 24, 2012 at 3:00 PM, Jack Krupansky wrote:
> Thanks. Sloppy p
I have a somewhat strange use case.
When I index a PDF to Solr I use ContentStreamUpdateRequest.
The Lucene document then contains in the "text" field all contained items
(the parsed items of the physical PDF).
I also need to add these parsed items to another Lucene document.
Is there a way to recei
Try adding the "pk" attribute to the parent entity in any of these 4 ways:
From: [mailto:vettepa...@hotmail.com]
Sent: Friday, January 11, 2013 1:18 PM
To: solr-user@lucene.apache.org
Subject: RE: how to perform a delta-import when related table is updated
Hi James,
Ok, so I did this:
Have you looked at the Solr admin interface in detail? Specifically, the analysis
section under each core. It provides some of the statistics you seem to
want. And it gives you the source code to look at to understand how to create
your own version of that. Specifically, the "Luke" package is what you
might
On 12 January 2013 02:03, Achim Domma wrote:
> "At the base, Solr indexes are Lucene indexes, so one can always
> drop down to that level."
>
> That's what I'm looking for. I understand that, in the end, there has to be
> an inverted index (or rather multiple of them), holding all "words" which
"At the base, Solr indexes are Lucene indexes, so one can always
drop down to that level."
That's what I'm looking for. I understand that, in the end, there has to be an
inverted index (or rather multiple of them), holding all "words" which occur
in my documents, each "word" having a list of d
On 12 January 2013 01:06, Achim Domma wrote:
>
> Hi,
>
> I have just set up my first Solr 4.0 instance and have added about one
> million documents. I would like to access the raw data stored in the index.
> Can somebody give me a starting point on how to do that?
>
> As a first step, a simple dump wo
Hi,
I have just set up my first Solr 4.0 instance and have added about one million
documents. I would like to access the raw data stored in the index. Can
somebody give me a starting point on how to do that?
As a first step, a simple dump would be absolutely ok. I just want to play
around and do s
Hi James,
Ok, so I did this:
I now get this error in the logfile:
SEVERE: Delta Import Failed
java.lang.IllegalArgumentException: deltaQuery has no column to resolve to
declared primary key pk='ID'
Peter,
See http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command ,
then scroll down to where it says "The deltaQuery in the above example only
detects changes in item but not in other tables..." It shows you two ways to
do it.
Option 1: add a reference to the last_modified
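The wiki's pattern boils down to a data-config sketch roughly like the following. This is an illustration only: the table and column names (item, details, last_modified) are assumptions, not taken from the original post.

```xml
<!-- Sketch of DIH delta-import across two tables; names are illustrative -->
<entity name="item" pk="ID"
        query="SELECT * FROM item"
        deltaQuery="SELECT ID FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'">
  <entity name="details"
          query="SELECT * FROM details WHERE item_id = '${item.ID}'"
          deltaQuery="SELECT item_id FROM details
                      WHERE last_modified &gt; '${dataimporter.last_index_time}'"
          parentDeltaQuery="SELECT ID FROM item
                            WHERE ID = '${details.item_id}'"/>
</entity>
```

The parentDeltaQuery is the key piece: it maps a changed child row back to the parent primary key so the whole parent document gets re-indexed.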
On 1/11/2013 9:15 AM, Markus Jelsma wrote:
FYI: XInclude works fine. We have all request handlers in solrconfig in
separate files and include them via XInclude on a running SolrCloud cluster.
Good to know. I'm still deciding whether I want to recombine or
continue to use xinclude. Is the xi
My delta-import
(http://localhost:8983/solr/freemedia/dataimport?command=delta-import) does
not correctly update my solr fields.
Please see my data-config here:
Now when a new item is inserted into [freemedialikes]
an
Hi Marcel,
Are you committing data with hard commits or soft commits? I've seen systems
where we've inadvertently only used soft commits, which means that the entire
transaction log has to be re-read on startup, which can take a long time. Hard
commits flush indexed data to disk, and make it
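For reference, a hard-commit policy can be configured in solrconfig.xml alongside soft commits. A minimal sketch, with illustrative interval values:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk at most every 60s, without opening a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: make documents visible to searchers every 5s -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With regular hard commits in place, the transaction log stays short and startup replay is correspondingly fast.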
On 11 January 2013 22:30, Jens Grivolla wrote:
[...]
> Actually, that is what you would get when doing a join in an RDBMS, the
> cross-product of your tables. This is NOT AT ALL what you typically do in
> Solr.
>
> Best start the other way around, think of Solr as a retrieval system, not a
> st
On 01/11/2013 05:23 PM, Gora Mohanty wrote:
You are still thinking of Solr as an RDBMS, which you should not
be. In your case, it is easiest to flatten out the data. This increases
the size of the index, but that should not really be of concern. As
your courses and languages tables are connected o
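Flattening as suggested means one Solr document per user/course/language combination rather than separate joined tables. A sketch of such a document in Solr's XML update format, with purely illustrative field names and values:

```xml
<add>
  <doc>
    <!-- One document per combination; id must be unique per document -->
    <field name="id">user42-course7-en</field>
    <field name="userid">42</field>
    <field name="coursename">Solr Basics</field>
    <field name="language">English</field>
  </doc>
</add>
```

Searching then becomes a single-index query with filters on userid, coursename, or language, with no join needed.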
They point to the admin UI - or should - that seems right?
- Mark
On Jan 11, 2013, at 10:57 AM, Christopher Gross wrote:
> I've managed to get my SolrCloud set up to have 2 different indexes up and
> running. However, my URLs aren't right. They just point to
> http://server:port/solr, not htt
Hmm, you need to set up the HttpClient in HttpShardHandlerFactory but you
cannot access the HttpServletRequest from there; it is only available in
SolrDispatchFilter AFAIK. And then, the HttpServletRequest can only return the
remote user name, not the password he, she or it provided. I don't kno
Hello, and thank you in advance for your help!
*Context:*
I have implemented a custom search component that receives 3 parameters:
field, termValue and payloadX.
The component should search for a termValue in the requested Lucene
field and for each *termValue* to check *payloadX* in its associated
On 11 January 2013 21:13, Niklas Langvig wrote:
> It sounds good not to use more than one core; I certainly do not want to
> overcomplicate this.
[...]
Yes, not only are multiple cores unnecessarily complicated here,
your searches will also be less complex, and faster.
> Both table courses a
I changed it to only go to one Zookeeper (localhost:2181) and it still gave
me the same stack trace error.
I was eventually able to get around this -- I just used the "bootstrap"
arguments when starting up my Tomcat instances to push the configs over --
though I'd rather just do it externally from
FYI: XInclude works fine. We have all request handlers in solrconfig in
separate files and include them via XInclude on a running SolrCloud cluster.
-Original message-
> From:Mark Miller
> Sent: Fri 11-Jan-2013 17:13
> To: solr-user@lucene.apache.org
> Subject: Re: Setting up new SolrC
On Jan 10, 2013, at 12:06 PM, Shawn Heisey wrote:
> On 1/9/2013 8:54 PM, Mark Miller wrote:
>> I'd put everything into one. You can upload different named sets of config
>> files and point collections either to the same sets or different sets.
>>
>> You can really think about it the same way y
It's a bug that you only see RuntimeException - in 4.1 you will get the real
problem - which is likely around connecting to zookeeper. You might try with a
single zk host in the zk host string initially. That might make it easier to
track down why it won't connect. It's tough to diagnose because
It sounds good not to use more than one core; I certainly do not want to
overcomplicate this.
Yes, I meant tables.
It's pretty simple.
Both tables, courses and languages, have their own primary keys, courseseqno and
languagesseqno.
Both also have a foreign key "userid" that references the users table wi
On 11 January 2013 19:57, Niklas Langvig wrote:
> Ahh sorry,
> Now I understand,
> OK, seems like a good solution, I just need to understand how to query
> multiple cores now :)
There is no need to use multiple cores in your setup. Going
back to your original problem statement, it can easily
I don't know how to query multiple cores, or whether it's possible at once, but
otherwise I would create a JOIN SQL script if you need values from multiple
tables.
D.
On Fri, Jan 11, 2013 at 3:27 PM, Niklas Langvig <
niklas.lang...@globesoft.com> wrote:
> Ahh sorry,
> Now I understand,
> Ok seems l
Ahh sorry,
Now I understand.
OK, seems like a good solution, I just need to understand how to query
multiple cores now :)
-Original message-
From: Dariusz Borowski [mailto:darius...@gmail.com]
Sent: 11 January 2013 15:15
To: solr-user@lucene.apache.org
Subject: Re: confi
Thanks Alex!
This brought me to the solution I wanted to achieve. :)
D.
On Thu, Jan 10, 2013 at 3:21 PM, Alexandre Rafalovitch
wrote:
> dataimport.properties is for DIH to store its own properties for delta
> processing and things. Try solrcore.properties instead, as per recent
> discussion:
Hmmm, it will not work for me. I want the "original" credentials
forwarded in the sub-requests. The credentials are mapped to permissions
(authorization), and basically I don't want a user to be able to have
something done in the (automatically performed by the contacted
solr-node) sub-requests that
Hi,
No, it actually has two tables, User and Item. The example shown on the
blog is for one table, because you repeat the same thing for the other
table. Only your data-import.xml file changes. For the rest, just copy and
paste it into the conf directory. If you are running your Solr on Linux, then
Hi Dariusz,
To me this example has one table, "user", while I have many tables that connect
to one user, and that is what I'm unsure how to do.
/Niklas
-Original message-
From: Dariusz Borowski [mailto:darius...@gmail.com]
Sent: 11 January 2013 14:56
To: solr-user@luce
Hmm, I noticed I wrote that I have 3 columns: users, courses and languages.
I of course meant I have 3 tables: users, courses and languages.
/Niklas
-Original message-
From: Niklas Langvig [mailto:niklas.lang...@globesoft.com]
Sent: 11 January 2013 14:19
To: solr-user@lucene.apache.o
Thinking about it some more:
Perhaps I could have coursename and such as multivalued fields?
Or should I have separate indices for users, courses and languages?
I get the feeling both would work, but I'm not sure which way is best.
When a user is updating/removing/adding a course it would be nice to
Hi Niklas,
Maybe this link helps:
http://www.coderthing.com/solr-with-multicore-and-database-hook-part-1/
D.
On Fri, Jan 11, 2013 at 2:19 PM, Niklas Langvig <
niklas.lang...@globesoft.com> wrote:
> Hi!
> I'm quite new to Solr and trying to understand how to create a schema from
> our pos
Hi!
I know the pain! ;)
That's why I wrote a bit on a blog, so I could remember it in the future. Here
is the link, in case you would like to read a tutorial on how to set up Solr
with multicore and hook it up to the database:
http://www.coderthing.com/solr-with-multicore-and-database-hook-part-1/
I hope
Hi,
We're experiencing slow startup times of searchers in Solr when it contains a
large number of documents.
We use Solr v4.0 with Jetty and currently have 267,657,634 documents stored,
spread across 9 cores. These documents contain keywords, with additional
statistics, which we are using for s
Seems I'm too lazy.
I found this http://wiki.apache.org/solr/MergingSolrIndexes, and it really
works.
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-removing-shard-how-to-not-loose-data-tp4032138p4032508.html
Hi!
I'm quite new to Solr and trying to understand how to create a schema from
our Postgres database and then search for the content in Solr instead of
querying the DB.
My question should be really easy; it has most likely been asked many times, but
still I'm not able to google any answer to
could you use field collapsing? Boost by date and only show one value
per group, and you'll have the most recent document only.
Upayavira
On Fri, Jan 11, 2013, at 01:10 PM, jmozah wrote:
> One crude way is to first query and pick the latest date from the result,
> then issue a query with q=timestamp:[
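The field-collapsing suggestion above could look roughly like the following request parameters. This is a sketch: the group field name and the timestamp field are illustrative, not from the original posts.

```
q=*:*&group=true&group.field=yourGroupField&group.limit=1&group.sort=timestamp desc
```

With group.limit=1 and group.sort on the date field, each group returns only its most recent document, so a single query replaces the two-query approach.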
One crude way is to first query and pick the latest date from the result,
then issue a query with q=timestamp:[latestDate TO latestDate].
But I don't want to execute two queries...
./zahoor
On 11-Jan-2013, at 6:37 PM, jmozah wrote:
>
>
>
>> What do you want?
>> 'the most recent ones' or '**only**
> What do you want?
> 'the most recent ones' or '**only** the latest' ?
>
> Perhaps a range query "q=timestamp:[refdate TO NOW]" will match your needs.
>
> Uwe
>
I need **only** the latest documents...
In the above query, "refdate" can vary based on the query.
./zahoor
Hi,
If your credentials are fixed, I would configure username:password in your
request handler's shardHandlerFactory configuration section and then modify
HttpShardHandlerFactory.init() to create an HttpClient with an AuthScope
configured with those settings.
I don't think you can obtain the ori
Hello.
Which is the best/fastest way to get the values of many fields from the index?
My problem is that I need to calculate a sum of amounts. This amount is in
my index (stored="true"). My PHP script gets all values with paging. But if a
request takes too long, Jetty kills this process of "expor
In solrconfig.xml:
<str name="defType">edismax</str>
<str name="qf">text^0.5 last_name^1.0 first_name^1.2 course_name^7.0 id^10.0
  branch_name^1.1 hq_passout_year^1.4
  course_type^10.0 institute_name^5.0 qualification_type^5.0
  mail^2.0 state_name^1.0</str>
<str name="df">text</str>
<str name="mm">100%</str>
<str name="q.alt">*:*</str>
<int name="rows">10</int>
Hi
I read http://wiki.apache.org/solr/SolrSecurity and know a lot about
webcontainer authentication and authorization. I'm sure I will be able to
set it up so that each solr-node will require HTTP authentication for
(selected) incoming requests.
But solr-nodes also make requests among each
On 10.01.2013 11:54, jmozah wrote:
I need a query that matches only the most recent ones...
Because my stats depend on it...
But I have a requirement to show **only** the latest documents and the
"stats" along with it...
What do you want?
'the most recent ones' or '**only** the latest' ?
Perh
Mark, I know I still have access to the data and I can wake the shard up again.
What I want to do is:
I have 3 shards on 3 nodes, one on each. Now I discover that I don't need 3
nodes and I want only 2.
So I want to remove a shard and put the data from it onto those that are left.
Is there a way to index that data wit