Thanks, Bruno and Matthew. I saw that tutorial before and Lingpipe requires
a license while we are looking at open source solutions. We are not clear
yet on how to use Solr to do sentiment analysis. Does an NLP or learning tool
have to be used to accomplish this task? If a tool is needed, how it can
My organization is considering a few different approaches for indexing
vs query rewrite and I'm trying to figure out what would be required
in order to implement some form of query rewrite. Let's say my index
has 2 fields, first name and last name. When the user does a query
name:bob I'd like to tr
Hi there,
For local business searches in big cities (e.g., Restaurant in NYC), I’d like
to sort the results by the density of the businesses in the underlying
neighborhoods (e.g., return x restaurants from the neighborhood that has the
highest density of restaurants).
A solution would be to se
Hi there,
Through SolrJ 3.2, I'm trying to set some Spatial Search queries (e.g., filter
by distance, sort by distance, etc.). I don’t know whether there's a specific
SolrJ syntax to do this. I tried using Strings, but it’s not working. Here
are two examples that work fine on Solr, but don’t
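
For reference, a minimal sketch of the request parameters that Solr 3.x spatial search expects (geofilt, sfield, pt, d, and a geodist() sort). The field name "store" and the coordinates are assumptions chosen for illustration, not taken from the original message. They are ordinary request parameters, so any client that can set arbitrary parameters should be able to send them.

```python
from urllib.parse import urlencode


def spatial_params(field, lat, lon, radius_km, sort_by_distance=True):
    """Build Solr 3.x spatial request parameters as (name, value) pairs."""
    params = [
        ("q", "*:*"),
        ("fq", "{!geofilt}"),    # filter by distance from the point
        ("sfield", field),       # location field (hypothetical name)
        ("pt", f"{lat},{lon}"),  # centre point as "lat,lon"
        ("d", str(radius_km)),   # radius in kilometres
    ]
    if sort_by_distance:
        params.append(("sort", "geodist() asc"))  # nearest first
    return params


# Hypothetical field and coordinates, for illustration only.
query_string = urlencode(spatial_params("store", 40.7143, -74.006, 5))
print(query_string)
```

Keeping sfield, pt and d as top-level parameters lets the filter and the sort share the same point.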
Now I've pasted sample solrconfig.xml to the project top page.
Can you visit and look at it again?
koji
--
http://www.rondhuit.com/en/
(11/07/09 2:29), Sowmya V.B. wrote:
Hi Koji
Thanks. I have checked out the code and began looking at it. The code
examples gave me an idea of what to do, though
We are trying to implement browsing based on the search functionality
(which will have facets for that particular category along with the item
list).
For this purpose we have added one field in the feed (.csv) file which
we use to create indexes so that we can do the search based on the
subcategory id
This may be a simple question; well, I hope so anyway. We have songs that
have punctuation and quoting, and the trick is to get all variations of a query to
return the correct result. Please see the following example.
From the database we index a song with title "Damon's Radical Song?". We wa
And don't you know, that EdgeNGram analyzer did the trick. Added the fieldtype,
added a new field based on it, copyfielded the old title to it, reindexed and
hey - it works brilliantly :)
And you were right, the analysis output does make sense once it actually
matches something :D
Thanks a mil
Note that you can't use LingPipe commercially without a license, though, I believe.
Sent from my iPhone
On 8 Jul 2011, at 18:20, Bruno Adam Osiek wrote:
> Try Lingpipe. They use Language Models as their engine for sentiment
> analysis. At (http://alias-i.com/lingpipe/) you will find a step-by-step
>
> You can add the "indent=true" parameter to the request to get a tidier
> output. Firefox usually ignores tabs when showing XML, so I'd suggest to
> choose "View page source" in that case.
>
The page source looks so much better. :) thanks!
> The documentation seems to suggest to have stored=true
Wanted to say thanks to everyone who contributed: Erick, Stefan, kenf_nc.
Erick, a solution based on your proposition has been implemented and
pushed to users. Thanks!
Best
On 5/19/11, Erick Erickson wrote:
> Oooh, that's clever
>
> The glitch is that field collapsing is scheduled for 3.2, but th
Nope, that should do it (although I haven't tried that
exact set of steps). But you do have to reindex
from scratch
Best
Erick
On Fri, Jul 8, 2011 at 1:36 PM, Christopher Cato
wrote:
> Thanks for that pointer, that's really more what I want to do. And actually,
> EdgeNGrams is stuck somewh
Hi Gora,
The problem I am finding is that the copyField directive sends the original
value to the new field type.
The field type then munges the index until it's completely different
(original -> some sentence this like, index -> true), but the stored value
is still the original sentence.
When I
Thanks for that pointer, that's really more what I want to do. And actually,
EdgeNGrams is stuck somewhere in the back of my head :) Yes, simple at first
thought but not as easy to implement as I have discovered.
Well, so how do I implement something like this? I took the fieldtype
declaration
Hi Koji
Thanks. I have checked out the code and began looking at it. The code
examples gave me an idea of what to do, though I am not fully clear, since
there are no comments there, to verify my understanding. Hence, mailing
again for clarification.
In NamedEntity.java, you add two fields "name",
Thanks, MMapDirectory was the reason - it was made the default in Lucene 3.3
http://lucene.apache.org/java/3_3_0/changes/Changes.html#3.3.0.changes_in_runtime_behavior
https://issues.apache.org/jira/browse/LUCENE-3198
From: Toke Eskildsen
To: "solr-user@lucene.a
Try Lingpipe. They use Language Models as their engine for sentiment
analysis. At (http://alias-i.com/lingpipe/) you will find a step-by-step
tutorial on how to implement it.
On 07/08/2011 07:14 AM, Zheng Qin wrote:
Hi,
We are starting a project on Twitter data sentiment analysis. We have
ins
Yeah, the analysis page takes a bit of getting used to, but it's well
worth the time. Be sure to check the "verbose" box. Taking some time
to understand what it's telling you is one of the best investments
you'll make.
Your "parts of words" is the issue. One approach is to use ngrams or
edgengrams
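
To illustrate what an edge n-gram filter contributes to matching "parts of words", here is a small sketch that lists the front-anchored prefixes such a filter would emit for one token. It is a plain Python imitation of the idea, not Solr's implementation, and the min_gram/max_gram names and defaults are assumptions chosen for the example.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the prefixes of token whose length is between min_gram and max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# Each prefix becomes an indexed term, so a query for "tec" can match "techno".
print(edge_ngrams("techno", min_gram=2, max_gram=4))
```

Because the prefixes are generated at index time, the field only needs them in the index analyzer; the query side can usually stay un-grammed.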
Hi Briggs, thanks for being patient with me!
Yeah, I saw I had a typo there in the OR clause. Fixed it but still no perfect
results.
I'm looking at the analysis.jsp page and can't really figure it out. Feeling a
bit overwhelmed by all the output. I also don't know how to check if stemming
is us
Hi Elain,
You can add the "indent=true" parameter to the request to get a tidier
output. Firefox usually ignores tabs when showing XML, so I'd suggest to
choose "View page source" in that case.
The documentation seems to suggest to have stored=true for the fields
> though, not sure why.
>
Maybe
On 7/8/2011 8:08 AM, Elaine Li wrote:
Guan and Koji, thank you both!
After I changed to termVectors = true, it returns the results as expected.
I flipped the stored=true|false for two fields: text and category_text
and compared the results and don't see any difference. The
documentation seems to
Hi Sowmya,
I basically wrote an annotator and built a buffering tokenizer around it
so I could include it in a Lucene analyzer pipeline. I've blogged about
it, not sure if it's good form to include links to blog posts in public
forums, but here they are, apologies in advance if this is wrong (let m
What browser are you using? Chrome and Firefox (and I think IE) have
plugins that'll format XML and JSON right in the browser, which helps with
this a lot.
Best
Erick
On Fri, Jul 8, 2011 at 10:08 AM, Elaine Li wrote:
> Guan and Koji, thank you both!
>
> After I changed to termVectors = true, it re
Well, it depends (tm). Raw search time should be unaffected (or very
close to that). The stored data is in a completely separate file in
the index directory and is not referenced during searches.
That said, assembling the response may take longer since you're
potentially reading more data from the
Hey Chris,
Removing the ORs in each query might help narrow down the problem, but I
suggest you run this through the query analyzer in order to see where it is
dropping out. It is a great tool for troubleshooting issues like these.
I see a few things here.
- for leading wildcard queries, you s
On Fri, Jul 8, 2011 at 4:11 AM, Thomas Heigl wrote:
> How should I proceed with this problem? Should I create a JIRA issue or
> should I cross-post on the dev mailing list? Any suggestions?
Yes, this definitely sounds like a bug in the 3.3 grouping (looks like
it forgets to weight the sorts).
Cou
Guan and Koji, thank you both!
After I changed to termVectors = true, it returns the results as expected.
I flipped the stored=true|false for two fields: text and category_text
and compared the results and don't see any difference. The
documentation seems to suggest to have stored=true for the fie
(11/07/08 16:19), Sowmya V.B. wrote:
Hi Koji
Thanks for the mail.
Thanks for all the clarifications. I am now using version 3.3. But
another query that I have about this is:
How can I add an annotator that I wrote myself into Solr-UIMA?
Here is what I did before I moved to Solr:
I wrot
Hi,
I have a problem with omitTermFreqAndPositions and omitNorms.
In my schema I have some fields with these properties set to true,
for example the field "category".
Then I make a query like:
select?q=category:("x" OR "y" or "Z")
It returns all docs that have x or y or z as category.
I make a debugQue
Hi Briggs. Thanks for taking the time. I have the query nearly working now,
currently this is how it looks when it matches on the title "Super Technocrane
30" and others with similar names:
INFO: [] webapp=/solr path=/select/
params={qf=title^40.0&hl.fl=title&wt=json&rows=10&fl=*,score&start=0&
I would prefer every setting to be in its default state, and to compare the
results with stored=true and stored=false.
2011/7/8 François Schiettecatte
> Hi
>
> I don't think that anyone has run such benchmarks, in fact this topic came
> up two weeks ago and I volunteered some time to do that because I ha
Hi
I don't think that anyone has run such benchmarks, in fact this topic came up
two weeks ago and I volunteered some time to do that because I have some spare
time this week, so I am going to run some benchmarks this weekend and report
back.
The machine I have to do this is a Core i7 960, 24GB,
> Okay, just to make sure, the correct connector should be this:
>
> connectionTimeout="2"
> redirectPort="8443"
> URIEncoding="UTF-8" />
>
> Can you confirm this? Did you restart tomcat?
This is my connector:
Yes, I'd resta
> I've changed the server.xml to add the URI Encoding. I've
> changed the schema version to 1.4. And I've reindexed my DB.
> But nothing has changed.
Okay, just to make sure, the correct connector should be this:
Can you confirm this? Did you restart tomcat?
Also can you paste the output of &de
Hi,
We are starting a project on Twitter data sentiment analysis. We have
installed LucidWorks, which also has a Solr admin page. By reading the
posts, it seems that sentiment analysis can be done by using OpenNLP or
machine learning (Mahout or Weka). Can you share with us which tool is good
at cl
I'm sorry if this mail is repeated. But my mail server gave me an error.
Hi!
I've changed the server.xml to add the URI Encoding. I've changed the schema
version to 1.4. And I've reindexed my DB. But nothing has changed.
In the analysis.jsp I've searched for "más", in order to find what happen
On Fri, 2011-07-08 at 07:12 +0200, Nikhil Chhaochharia wrote:
> However, if I upgrade to Solr 3.3, then the Virtual Memory of the Tomcat
> process increases to roughly the index size (70GB). Any ideas why
> this is happening?
Maybe you switched to MMapDirectory?
http://lucene.apache.org/java/3_
Hi!
I've changed the server.xml to add the URI Encoding. I've changed the schema
version to 1.4. And I've reindexed my DB. But nothing has changed.
In the analysis.jsp I've searched for "más", in order to find what happens with
that word, and it's also recognized as two characters, just like "
How should I proceed with this problem? Should I create a JIRA issue or
should I cross-post on the dev mailing list? Any suggestions?
Cheers,
Thomas
On Wed, Jul 6, 2011 at 9:49 AM, Thomas Heigl wrote:
> My query in the unit test looks like this:
>
> q=*:*&fq=_query_:"{!geofilt sfield=user.loca
Hi,
Is there any performance degradation (response time, etc.) if the index has
document content text stored in it (stored=true)?
-JAME
> Hello, I am using solr search. my
> search field contains both "diamond" and
> "Diamond".
> But when i search for Diamond/diamond it gives me correct
> results. But when
> i search for Diamond*/diamond*, I get result for diamond*
> but no results
> found for Diamond* . although i have applied
>
Hello,
As I see from analysis.jsp, your á letter is not converted to 'a' by the ASCII
folding filter. It is recognized as two characters 'á' (before it comes to
ASCII folding) for some reason.
First of all I would check URI Encoding of my servlet container. It should be
utf-8. See tomcat's config
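
One way to see why a single accented letter can show up as two characters is sketched below, on the assumption that the cause is a decoding mismatch: the client sends UTF-8 bytes and the container decodes them as ISO-8859-1 (Latin-1). This only demonstrates the symptom; it does not confirm that this is what happens in the setup discussed here.

```python
word = "m\u00e1s"                # "más", with a precomposed á (U+00E1)

raw = word.encode("utf-8")       # á becomes the two bytes 0xC3 0xA1
misread = raw.decode("latin-1")  # each byte is read as its own character

print(len(word), len(raw), len(misread))
print(misread)
```

If the analysis page shows the two-character form, the damage happened before analysis, which is why the container's URI encoding is the first thing to check.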
Set up the filter on query and indexing to make it case insensitive...
Then reindex.
On Fri, Jul 8, 2011 at 1:26 AM, Romi wrote:
> Hello, I am using solr search. my search field contains both "diamond" and
> "Diamond".
> But when i search for Diamond/diamond it gives me correct results. But when
>
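
A possible client-side workaround is sketched below. It assumes that wildcard terms bypass the field's analysis chain in the Solr version in use, so a lowercase filter never sees "Diamond*"; that behaviour is an assumption about this setup, and the helper name is hypothetical. The function lowercases only the terms that contain a wildcard and leaves field names and boolean operators alone.

```python
import re


def lowercase_wildcard_terms(query):
    """Lowercase whitespace-separated terms containing * or ?, keeping field names as-is."""
    def fix(match):
        term = match.group(0)
        if "*" not in term and "?" not in term:
            return term  # plain terms and operators such as OR are untouched
        field, sep, value = term.rpartition(":")
        return field + sep + value.lower()

    return re.sub(r"\S+", fix, query)


print(lowercase_wildcard_terms("title:Diamond* OR Ring"))
```

This only helps when the indexed terms are lowercased, which is the case if a lowercase filter is applied at index time.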
I'm using collectiveaccess, and its DB structure. Perhaps this is useful...
My type definition is:
Hello, I am using solr search. my search field contains both "diamond" and
"Diamond".
But when I search for Diamond/diamond it gives me correct results. But when
I search for Diamond*/diamond*, I get results for diamond* but no results
found for Diamond*, although I have applied .
would you plea
Hi Koji
Thanks for the mail.
Thanks for all the clarifications. I am now using version 3.3. But
another query that I have about this is:
How can I add an annotator that I wrote myself into Solr-UIMA?
Here is what I did before I moved to Solr:
I wrote an annotator (which worked when I use
Yes, I use something like that. I make a db connection to get the facets for
the chosen category. With this data I add facet.fields dynamically:
example:
$qStr = "";
foreach ($results as $result) {
    // append one facet.field parameter per facet instead of overwriting
    $qStr .= "&facet.field=" . urlencode($result);
}
I was searching for a solution where I don't need to get the facets from the db.
Now I
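
The dynamic facet.field approach described above can be sketched as follows: facet.field is a repeatable parameter, so it is added once per field. The query and the field names "brand" and "color" are made up for the example, and the list of fields stands in for whatever the database lookup returns.

```python
from urllib.parse import urlencode


def facet_query_string(query, facet_fields):
    """Build a query string with faceting enabled and one facet.field per field."""
    params = [("q", query), ("facet", "true")]
    params.extend(("facet.field", name) for name in facet_fields)
    return urlencode(params)


# Hypothetical query and facet fields, for illustration only.
print(facet_query_string("category:shoes", ["brand", "color"]))
```

Using a list of pairs rather than a dict is what allows the same parameter name to appear more than once.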