I want to store a bunch of documents in elasticsearch (which represent a
hit to a website) including the user agent of the client that made the
original HTTP request.
Since user agent strings have a lot of variance, and the useful parts need
parsing out (OS, browser, version, etc.), I would like
Hi,
You should give http://logstash.net/docs/1.4.2/filters/useragent a try before
anything else.
Here is the relevant part of logstash.conf I'm using:
filter {
  if [type] == "apache" {
    if [user-agent] != "-" and [user-agent] != "" {
      useragent {
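For reference, a fuller sketch of such a filter; the field names (`user-agent` as the source, `ua` as the target) are assumptions, not taken from the original post:

```
filter {
  if [type] == "apache" {
    if [user-agent] != "-" and [user-agent] != "" {
      useragent {
        # parse the raw UA string into browser name / os / version fields
        source => "user-agent"
        target => "ua"
      }
    }
  }
}
```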
Ok, thanks again for the help!
On Wednesday, June 25, 2014 3:37:00 PM UTC+2, Cédric Hourcade wrote:
In fact they are in the _all field, but not analyzed with your
trigrams analyzer.
Cédric Hourcade
c...@wal.fr
On Wed, Jun 25, 2014 at 3:12 PM, Andreas Falk adde...@gmail.com
Hi,
We're now in performance test and seeing some unexpected result.
We use Java percolate API
client.preparePercolate().setIndices(index).setDocumentType(projectName).setSource(log).execute().actionGet();
LOGGER.info(duration + " ms for percolation, es time "
    + response.getTookInMillis() + " ms for log
Hello everyone,
Sorry for the recent spam. User banned, and I'm heading off to clean up the
online archives now.
Cheers,
LH
--
Leslie Hawthorn
Community Manager
http://elasticsearch.com
Other Places to Find Me:
Freenode: lh
Twitter: @lhawthorn
Skype: mebelh
Voice: +31 20 794 7300
Thanks for your quick reply,
I need some clarification about what you meant by "delete the river",
"delete the _river index", and "this state is useful for flow control".
From what I have understood from your reply, and supposing that I have
imported data into a documents river using the JDBC
Hi,
We have recently migrated our application from 'bare Lucene + Zoie for
realtime search' to Elasticsearch. Elasticsearch is awesome and, next to
scalability, it gives us lots of additional features. The one thing we
really miss, though, is realtime search.
Search is the core of our
On Wednesday, June 25, 2014 11:50:42 UTC+2, rayman wrote:
We are thinking of using Elasticsearch as our primary database.
But I am concerned about a few things:
1. If we need to modify a document type (let's say add a new field) we will
need to re-index all raw data. Therefore we need to keep
Hi,
Our current mapping does not include the _all field; is there a way to update
the mapping to include it, or is re-indexing required?
Regards
Shawn
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop
Yes, removing a river is DELETE _river/rivername, and deleting the river index
is DELETE _river.
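In Sense-style REST terms (the river name is a placeholder), that is:

```
DELETE /_river/my_jdbc_river
DELETE /_river
```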
The JDBC river state keeps track of some timestamps, counters, and the last
row of the SQL statement. Yes, in case of a node switchover, where the river
instance is restarted on another node, the new node could
Added the Solr benchmark as well:
Number of different meta data field
ES with disable _all/codec bloom filter
ES (Ingestion Query concurrently)
Solr
Solr(Ingestion Query concurrently)
Scenario 0: 1000
13 secs - 769 docs/sec
CPU: 23.68%
iowait: 0.01%
Heap: 1.31G
Index Size: 248K
Ingestion
Zoie is not for distributed search. If you want to analyze the LinkedIn
developments for this area with Lucene, you should look at Sensei
There was also a BalancedSegmentMergePolicy donated to Lucene 2.x from the
Zoie project
https://issues.apache.org/jira/browse/LUCENE-1924
but there was not
Did you get this working in the end Maarten?
I have the same problem with the way 'intersects' works and Jilles's
solution doesn't work for me; possibly due to the 'tree_levels' accuracy
for quad tree.
As a kind of workaround, I was thinking that you could draw 2 'envelope'
geo_shape
Hm, I encounter strange scoring results I do not understand. I tracked
down the scoring and it seems like the 'queryWeight' is missing sometimes.
That's what explain gives me for one document:
{
  "value": 8.252264,
  "description": "weight(collector_1.default.raw:salzburg^18.0 in
11412869)
I need it ASAP, please can somebody share it?
river-mongodb doesn't work with 1.2.1.
There's a critical bug with 1.2.0 which is why it was removed.
See http://www.elasticsearch.org/blog/elasticsearch-1-2-1-released/
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 26 June 2014 21:38, Антон Кикоть
That's cool, man, but I have a production server with elasticsearch (v 1.2.1)
and I need elasticsearch-river-mongodb, but when I try to set up the river on
the new ES version it doesn't work:
{1.2.1}: Initialization Failed ...
- ExecutionError[java.lang.NoClassDefFoundError:
Very few GC messages in the logs, and none around the OOM instances...
Cheers,
-Robin-
On Wednesday, 25 June 2014 16:55:03 UTC+2, Michael Hart wrote:
What do your GC Old Count and GC Old Duration look like? Do you have
warnings in the logs about long GCs?
I've got similar issues and
It looks like a bug to me. I think you should open an issue and add all those
details in.
Best
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 26 June 2014 at 13:53:03, Tanguy Moal (tanguy.m...@gmail.com) wrote:
Dear group,
I'm experiencing an issue
It sounds like it has not been updated for es 1.2:
https://github.com/richardwilly98/elasticsearch-river-mongodb/pull/283
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 26 June 2014 at 13:51:25, Антон Кикоть (antony.ki...@gmail.com) wrote:
it is
Hi David and Jörg,
Many thanks for your help. I finally found the problem. It was in the
version of the Postgres driver.
Kind regards!!!
Jorge von Rudno
On 2014-06-25 at 18:01 GMT+02:00, joergpra...@gmail.com wrote:
You did not specify an index for the JDBC river to index to,
Hi Richard,
thanks for your answer, it sure helped! Still, I am puzzled by a few
effects and questions:
1.) I am a bit confused by your class/instance idea. I can do something
pretty simple like class { 'elasticsearch': version => '0.90.7' } and it
will install elasticsearch in the
Hi, I tried to do the same as in the link
https://github.com/richardwilly98/elasticsearch-river-mongodb/pull/283
but nothing changed :(
Just for what it's worth: we have been using ES as our primary datastore for
almost 2 years. So far so good.
I think that the blog post you are referring to is *very* interesting *but*
at the same time, think about how many sql databases out there are not even
backed-up in production... are they
I've run into this problem because my /tmp is mounted as noexec.
Worked around this using these java opts:
-Djna.tmpdir=/usr/share/elasticsearch/tmp
-Djava.io.tmpdir=/usr/share/elasticsearch/tmp
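One place to put those flags (assuming the deb/rpm packaging; the directory must exist, be writable by the elasticsearch user, and sit on a filesystem mounted without noexec) is the environment file read by the init script:

```
# /etc/default/elasticsearch (Debian) or /etc/sysconfig/elasticsearch (RHEL)
ES_JAVA_OPTS="-Djna.tmpdir=/usr/share/elasticsearch/tmp -Djava.io.tmpdir=/usr/share/elasticsearch/tmp"
```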
Cheers
On Wednesday, June 25, 2014 7:28:17 PM UTC-3, dup90011 wrote:
Centos 6(64)
Tried to
I switched my importing of a CSV file from single inserts to a bulk insert,
but I'm not sure why all my documents are nested in a doc field
instead of inserted as values. There is no 'doc' field in the dataset, so I'm
not sure where that value is coming from.
ie:
"hits": [
  {
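One common cause worth checking: sending bulk bodies shaped like the update API (which wraps each source in a "doc" object) instead of plain index actions. A minimal Python sketch of a plain index-action bulk body; the index/type/field names here are made up for illustration:

```python
import json

rows = [
    {"city": "Berlin", "visits": 10},
    {"city": "Paris", "visits": 7},
]

lines = []
for row in rows:
    # Action line: a plain "index" action. An "update" action would instead
    # require the source wrapped as {"doc": {...}}, which is where a stray
    # "doc" field usually comes from.
    lines.append(json.dumps({"index": {"_index": "hits", "_type": "hit"}}))
    # Source line: the document itself, fields at the top level.
    lines.append(json.dumps(row))

# Bulk bodies are newline-delimited JSON and must end with a newline.
body = "\n".join(lines) + "\n"
```

POSTing such a body to _bulk should index each row with its fields at the top level, not nested under doc.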
See PR #3278. Hopefully it will get merged into one of the next releases.
https://github.com/elasticsearch/elasticsearch/pull/3278
Thanks,
Matt Weber
On Thu, Jun 26, 2014 at 12:10 AM, Thomas thomas.bo...@gmail.com wrote:
Hi,
Unfortunately this is not supported by elasticsearch, the
Hi,
Could anyone help please? We're kind of stuck right now - trying to get to
a point where we can demonstrate ES working for our use cases to get
management blessing.
thanks
Fuzz.
On Tuesday, 24 June 2014 14:43:01 UTC+1, dazraf wrote:
Hi,
Very grateful for any help with the following (rather
Other unexpected results arise due to different queryNorms.
For the first result I get a query norm of:
{
  "value": 0.0059806756,
  "description": "queryNorm"
}
For some other documents it's:
{
  "value": 0.0031318406,
  "description": "queryNorm"
}
The queryNorm is multiplied in to create the score, so it
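For context, with Lucene's classic similarity the queryNorm is 1 / sqrt(sum of squared term weights), so documents only see different queryNorms when they are scored against effectively different weight sets (per-shard IDF in a distributed index is a common culprit). A small sketch of the formula:

```python
import math

def query_norm(sum_of_squared_weights):
    # Lucene DefaultSimilarity: queryNorm(w) = 1 / sqrt(w)
    return 1.0 / math.sqrt(sum_of_squared_weights)

print(query_norm(4.0))  # 0.5
```

Re-running the query with search_type=dfs_query_then_fetch is the usual way to rule out per-shard IDF differences.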
Thank you for your suggestion. I tried the stream2es library but I get an
OutOfMemoryError when trying to use that.
On Friday, June 6, 2014 5:13:19 PM UTC-4, Antonio Augusto Santos wrote:
Take a look at stream2es https://github.com/elasticsearch/stream2es
On Friday, June 6, 2014 2:13:06 PM
Is this related to this issue?
https://github.com/elasticsearch/elasticsearch/issues/3022
On Tuesday, 24 June 2014 14:43:01 UTC+1, dazraf wrote:
Hi,
Very grateful for any help with the following (rather urgent) issue.
Gist: https://gist.github.com/dazraf/55ebb900b3c17583bf58
The script
Hey all,
We are considering building a fan-out feed inbox type system on top of ES
for Reverb.com. The way it would work is each user can follow some number
of searches. Using the percolator, we would plop new items as they matched
searches into individual user feeds. We are going to have the
Hi there,
Is there a way to calculate the average over the doc_count result of a
bucket aggregation?
For instance, I have this aggregation query:
GET channel/Subscription/_search
{
  "size": 1,
  "aggs": {
    "SubscriptionsPerUser": {
      "terms": {
        "field": "UserId",
        "min_doc_count": 0,
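One workaround is computing the average client-side over the returned buckets; a minimal Python sketch, with the response shape abbreviated and the values made up:

```python
# Hypothetical response fragment for the SubscriptionsPerUser terms aggregation.
response = {
    "aggregations": {
        "SubscriptionsPerUser": {
            "buckets": [
                {"key": "user-1", "doc_count": 4},
                {"key": "user-2", "doc_count": 2},
                {"key": "user-3", "doc_count": 0},
            ]
        }
    }
}

buckets = response["aggregations"]["SubscriptionsPerUser"]["buckets"]
# Average doc_count over all returned buckets (min_doc_count: 0 keeps empty ones).
average = sum(b["doc_count"] for b in buckets) / len(buckets)
print(average)  # 2.0
```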
I'm looking for some pointers on how to debug my issue: my PHP script
works fine when run from the command line but somehow crashes when served
by Apache (I'm using an EC2 instance that runs Apache and elasticsearch on
Ubuntu).
It seems $client = new Elasticsearch\Client() fails when served by
Hey Guys,
I'm working on an analytics dashboard project where we collect events into
Elasticsearch for clients. Each client could have millions of events per month.
We are thinking of using one index with one shard and one replica per client.
Looking at Logstash, it seems like Logstash creates
Hmmm ...
How does /tmp affect linking the C library?
My /tmp is fine.
Does any elasticsearch support exist here, or is it only for the paid version?
From the data you have provided I see that your bucket and keys for
development and production are different. Point your development
elasticsearch instance to the same AWS account and bucket in which you are
storing the snapshot.
On Jun 26, 2014 9:15 PM, Brian Lamb brian.l...@researchsquare.com
Check the *mlockall* section on
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html
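The gist of that section, as a sketch (exact file locations vary by install):

```
# elasticsearch.yml: lock the process address space into RAM
bootstrap.mlockall: true
```

The process also needs permission to lock memory (ulimit -l unlimited, or MAX_LOCKED_MEMORY=unlimited in the service's defaults file).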
The ES devs are around and they may answer your question. But real support
is only available with the paid version.
On Thu, Jun 26, 2014 at 4:11 PM, dup90011 regis...@xdbr.com wrote:
Hmmm
oh well - curl was not installed it seems. I installed curl, restarted
apache and now everything works fine...
X
On Thursday, June 26, 2014 10:16:21 AM UTC-7, Xavier Sorgel wrote:
I'm looking for some pointers on how to debug my issue: my PHP script
works fine when run from the command line
Drew,
The Elasticsearch default is to create 5 shards for each index. I would start
with this. Typically it is best to actually over-shard, which is to say have
more than 1 shard per node per index. There is not really any measurable cost
to this and it gives you flexibility in your design as
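As an illustration in Sense-style REST (the index name and counts are made up), the shard count is fixed when the index is created:

```
PUT /client-index
{
  "settings": {
    "number_of_shards": 10,
    "number_of_replicas": 1
  }
}
```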
ES runs on RHEL/Centos 6.
What exact RHEL/Centos 6 is this? Find out with command
cat /etc/redhat-release
What Java JVM do you use?
What file system do you have ES installed on? It must have permission to
execute binaries.
Also you can run this little Java program to find the reason:
Save
Hi!
Have there been any further explorations in the area of wan replication?
I have ES clusters in multiple datacenters connected via high-speed private
network. I'm wondering if multi-master replication would be possible in
this environment or if we'd need some type of 'shovel' plugin like
Hello
I have the following data
[
  {
    "id": 1,
    "collection": [
      { "key": "val1" },
      { "key": "val2" },
      { "key": "val3" }
    ]
  },
  {
    "id": 2,
Hi all, first time Elasticsearch user.
I am using Elastic as part of running SugarCRM 7 on-site, as it is a
pre-req. I have a clean Ubuntu 14.04 install with a LAMP stack and Java
installed. I have installed Elasticsearch using apt-get based on the
repository instructions on the Elastic
Which index is growing?
Chances are it is the marvel index(es), which is expected.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 27 June 2014 10:07, Grant Christensen grant.christen...@supercorp.com.au
wrote:
Yes that is correct. Looks like a daily Marvel index is growing. Is this the
plugin grabbing stats?
Grant Christensen
General Manager - Sales and Product
[More about Supercorp] http://www.supercorp.com.au/
e: grant.christen...@supercorp.com.au
Sure is.
There's more info in the docs
http://www.elasticsearch.org/guide/en/marvel/current/
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 27 June 2014 10:11, Grant Christensen grant.christen...@supercorp.com.au
Hi Andrew,
Not sure if you read my original question. The question is about having a
separate index per customer, since we are going to have 1000 customers but
each would have a lot of data. Each shard comes with its own overhead, since
it's an instance of Lucene. I was going with the 1 shard
Pretty sure he read it as I'd have offered the same advice :)
You cannot change the sharding of an index after creation, you need to
completely reindex the data to do so. This may not be a major issue for you
but it's something to take into account when you have hundreds or thousands
of customers,
Hi Mark,
The problem that we have is that each customer could generate 60-80 million
docs/month on average. In addition, when a customer leaves, we would need to
delete all their data. So it makes sense to have an index per customer
(or even multiple indexes per customer). Another issue
Thanks Matt, that feature is exactly what we need. One thing I couldn't figure
out: I would be able to pass a routing key so only the relevant shards
would be queried, right?
On Jun 26, 2014, at 8:14 AM, Matt Weber matt.we...@gmail.com wrote:
See PR #3278. Hopefully it will get merged
I have not tested routing but I did put that functionality in so it should
work fine. Let me know if you have any issues!
Thanks,
Matt Weber
On Thu, Jun 26, 2014 at 7:20 PM, Drew Kutcharian d...@venarc.com wrote:
Thanks Matt, that feature is exactly what we need. One thing I couldn’t
Hi All,
I am working on a project with elasticsearch and require the top_hits
aggregation. With Maven Central having only up to 1.2.0, I am currently
unable to test/develop the module that requires the top_hits aggregation. This
is the key feature that made us move to elasticsearch. If there
Ahh ok, knowing this extra info is good as it helps us help you :)
Logstash doesn't define how many shards to use, at least not that I can see
here -
https://github.com/elasticsearch/logstash/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template.json
-
or through some quick tests.
You will find it in the Sonatype repo.
HTH
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 27 June 2014 at 06:49, Veerapuram Varadhan v.varad...@gmail.com wrote:
Hi All,
I am working on a project with elasticsearch and require the top_hits
aggregation. With maven