Re: Could someone do a presentation on Trendulo at Accumulo Summit 2015?

2014-12-03 Thread Jared Winick
As Michael notes, that project hasn't been updated in a while. That said, I was actually going to be submitting a proposal for a talk on building aggregation systems on Accumulo (sorry if I just ruined any sort of blind review process). As Trendulo is basically just a word count + time dimension ag

Re: Program runs fine in NetBeans fails the shutdown properly running via Maven

2014-11-30 Thread Jared Winick
Please only use the "jaredwinick" package name for code that works perfectly. Thanks. ;-) On Sun, Nov 30, 2014 at 5:34 PM, Josh Elser wrote: > Great. Glad it's working as expected. > > > David Medinets wrote: > >> Of course I did. Using CleanUp.shutdownNow(); works perfectly from >> inside the "

Re: Z-Curve/Hilbert Curve

2014-07-28 Thread Jared Winick
As several people have commented, a single range for a query can produce a lot of false positives that need to be filtered out. I had made this visualization a while back that lets you interactively (click-drag a bounding box) see this behavior. http://bl.ocks.org/jaredwinick/5073432 On Sun, J
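The false-positive behavior described in this post can be reproduced on a toy grid. The following is a minimal sketch, not code from the thread: the function names `interleave2` and `single_range_scan`, the 16x16 grid, and the choice of bounding box are all illustrative assumptions. It treats the query as one contiguous range of Z-order codes running from the box's minimum corner to its maximum corner, then counts how many cells in that range actually lie inside the box.

```python
def interleave2(x: int, y: int, bits: int = 4) -> int:
    """Interleave the bits of x and y into a Z-order (Morton) code.

    x occupies the even bit positions and y the odd ones.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def single_range_scan(x_min, x_max, y_min, y_max, bits=4):
    """Simulate one contiguous scan over [z(min corner), z(max corner)].

    Returns (hits, false_positives): cells in the scanned range that fall
    inside the bounding box, and cells in the range that fall outside it.
    """
    lo = interleave2(x_min, y_min, bits)
    hi = interleave2(x_max, y_max, bits)
    hits = 0
    false_positives = 0
    side = 1 << bits  # hypothetical grid of side x side cells
    for x in range(side):
        for y in range(side):
            z = interleave2(x, y, bits)
            if lo <= z <= hi:
                if x_min <= x <= x_max and y_min <= y <= y_max:
                    hits += 1
                else:
                    false_positives += 1
    return hits, false_positives


if __name__ == "__main__":
    # A 2x2 box that straddles a quadrant boundary of the curve.
    print(single_range_scan(3, 4, 3, 4))
    # A 2x2 box aligned with the curve.
    print(single_range_scan(0, 1, 0, 1))
```

In this sketch, a box aligned with the curve needs no filtering, while a box of the same size that straddles a quadrant boundary pulls in many cells that must be discarded after the scan.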

Re: Out of Memory in Embedded Jetty

2014-05-11 Thread Jared Winick
You are likely running into a problem that many others have hit when running the Accumulo client from application servers; it is documented in https://issues.apache.org/jira/browse/ACCUMULO-1379. Are you able to upgrade to 1.5.1? If so, check out the cleanup methods that were added as part of https://issues.a

Re: Accumulo and Spark

2014-01-15 Thread Jared Winick
I tried myself a few weeks ago and saw that it "just works" too for the very simple test I ran. I did see some error messages when running from sbt after the job successfully completed and the SparkContext was closing. I assume this has to do with resources within the AccumuloInputFormat? This was

Re: How to get count of table rows using accumulo shell

2013-10-11 Thread Jared Winick
After following the commands Eric lists to set the iterator for that table, instead of running 'egrep' in the shell, you could do this from the Linux command line: accumulo shell -u username -p password -e "scan -t foo" | wc -l On Fri, Oct 11, 2013 at 11:42 AM, Eric Newton wrote: > You can stac

Re: Accumulo init over existing instance

2013-10-08 Thread Jared Winick
In my experience, you need to remove the accumulo directory in HDFS (hadoop fs -rmr /accumulo) before "accumulo init" will allow you to proceed. That is all you should have to do. Jared On Tue, Oct 8, 2013 at 3:32 PM, Terry P. wrote: > So reverse DNS wasn't working when I deployed my new clust

Re: Storing, Indexing, and Querying data in Accumulo (geo + timeseries)

2013-06-17 Thread Jared Winick
Have you considered a "geohash" of all 3 dimensions together and using that as the RowID? I have never implemented a geohash exactly, but I do know it is possible to build a z-order curve on more than 2 dimensions, which may be what you want considering that it sounds like all your queries are in 3
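A z-order curve over three dimensions can be sketched by interleaving one bit from each dimension in turn. The example below is only an illustration under stated assumptions: the post does not name the three dimensions, so longitude, latitude, and time are assumed from the thread subject, and the helper names `interleave3`, `quantize`, and `row_id`, the 10 bits per dimension, and the hex encoding of the RowID are all hypothetical choices rather than anything from the thread.

```python
def interleave3(x: int, y: int, t: int, bits: int = 10) -> int:
    """Interleave the bits of three integers into one Z-order code."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (3 * i)
        z |= ((y >> i) & 1) << (3 * i + 1)
        z |= ((t >> i) & 1) << (3 * i + 2)
    return z


def quantize(value: float, lo: float, hi: float, bits: int = 10) -> int:
    """Map a value in [lo, hi] onto an integer cell index in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return int((clamped - lo) / (hi - lo) * cells)


def row_id(lon, lat, t, t_min, t_max, bits: int = 10) -> str:
    """Build a fixed-width hex RowID so that lexicographic order matches Z-order."""
    z = interleave3(
        quantize(lon, -180.0, 180.0, bits),
        quantize(lat, -90.0, 90.0, bits),
        quantize(t, t_min, t_max, bits),
        bits,
    )
    width = (3 * bits + 3) // 4  # hex digits needed for 3 * bits bits
    return format(z, "0{}x".format(width))


if __name__ == "__main__":
    # Hypothetical point and time window, for illustration only.
    print(row_id(-105.27, 40.01, 50, 0, 100))
```

Zero-padding the hex string to a fixed width matters here because Accumulo sorts rows lexicographically; without it, numeric order and sort order would diverge.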

Re: Should I store Long values as String or Long?

2013-05-13 Thread Jared Winick
I believe the feature John is referring to above is the Formatter interface (org.apache.accumulo.core.util.format.Formatter). You can implement this interface to convert key/values to a more human readable format for the shell. You can drop a JAR file containing your implementation into lib/ext jus

Re: interesting

2013-05-03 Thread Jared Winick
That is very interesting and sounds like a fun Friday project! Could you please elaborate on how you mapped the original format of ngram TAB year TAB match_count TAB volume_count NEWLINE into Accumulo key/values? Could you briefly explain what feature in Accumulo is responsible for this improveme

Re: Suggestions on modeling a composite row key

2013-02-27 Thread Jared Winick
And if you weren't already aware, if you do something like Christopher mentions, or anything that makes your Keys less than human friendly, check out the Formatter interface http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/util/format/Formatter.html. This will let you write a Formatt

Re: Accumulo on CDH4

2012-07-02 Thread Jared winick
For those interested in tracking this, the compatibility with Hadoop 0.23 is documented at https://issues.apache.org/jira/browse/ACCUMULO-564 On Sun, Jul 1, 2012 at 10:14 AM, John Vines wrote: > CDH4 is based on hadoop 0.23/2.0, which is substantially different from > CDH3/hadoop 0.20.2/hadoop 1

Re: Scan and display entire row

2012-05-24 Thread Jared winick
The Key ( http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/data/Key.html) has methods to get the row, column family, and column qualifier, so that would give you all the information you see printed in the Accumulo shell. Like John says, the Key is not just a row ID. See the "Data Mo

Re: Trendulo - A Twitter Analytics Demo on Accumulo

2012-04-30 Thread Jared winick
ine there are a lot of entries with a very small count as the language used on Twitter is far from normal. On Fri, Apr 27, 2012 at 1:09 PM, Eric Newton wrote: > > On Wed, Apr 25, 2012 at 3:10 PM, Jared winick wrote: > >> I am not exactly sure how to answer the question about storage s

Re: Trendulo - A Twitter Analytics Demo on Accumulo

2012-04-25 Thread Jared winick
are you using EBS or local instance storage? > > On Apr 25, 2012, at 8:52 AM, Eric Newton wrote: > > How many key-values does a single tweet become, on average? What's the > storage size per tweet? > > On Wed, Apr 25, 2012 at 12:17 AM, Jared winick wrote: > >> Thanks fo

Re: Trendulo - A Twitter Analytics Demo on Accumulo

2012-04-24 Thread Jared winick
, just let me know or add them as issues on the github page. On Tue, Apr 24, 2012 at 11:40 AM, Billie J Rinaldi wrote: > That's so cool that I'm creating a new section for it on our page of links: > http://accumulo.apache.org/papers.html > > Billie > > On Tuesday, April

Trendulo - A Twitter Analytics Demo on Accumulo

2012-04-24 Thread Jared winick
I gave an Introduction to Apache Accumulo presentation last month at the Boulder/Denver Meetup where I demoed an application that used Accumulo to provide real-time

Re: Zookeeper ConnectionLossException

2012-03-30 Thread Jared winick
ault), then you will most likely lose > the server.  You will notice this happening when the freemem approaches > totalmem. > > I don't have much experience running Accumulo on VMs, but I have seen VMs > have strange behavior with respect to timekeeping.  That might be another >