Re: how to get actual count from as long from JavaDStream ?

2014-09-30 Thread Andy Davidson
be changed for Java and probably the function argument syntax is wrong too, but hopefully there's enough there to help. Jon On Tue, Sep 30, 2014 at 3:42 PM, Andy Davidson a...@santacruzintegration.com wrote: Hi I have a simple streaming app. All I want to do is figure out how many lines I

Re: iPython notebook ec2 cluster matlabplot not found?

2014-09-29 Thread Andy Davidson
, that will be problematic. Finally, there is an open pull request https://github.com/apache/spark/pull/2554 related to IPython that may be relevant, though I haven’t looked at it too closely. Nick On Sat, Sep 27, 2014 at 7:33 PM, Andy Davidson a...@santacruzintegration.com wrote: Hi I am having

Re: iPython notebook ec2 cluster matlabplot not found?

2014-09-29 Thread Andy Davidson
it to the slaves, that will be problematic. Finally, there is an open pull request https://github.com/apache/spark/pull/2554 related to IPython that may be relevant, though I haven’t looked at it too closely. Nick On Sat, Sep 27, 2014 at 7:33 PM, Andy Davidson

newbie system architecture problem, trouble using streaming and RDD.pipe()

2014-09-29 Thread Andy Davidson
Hello, I am trying to build a system that does a very simple calculation on a stream and displays the results in a graph, which I want to update every second or so. I think I have a fundamental misunderstanding about how streams and rdd.pipe() work. I want to do the data visualization

iPython notebook ec2 cluster matlabplot not found?

2014-09-27 Thread Andy Davidson
Hi, I am having a heck of a time trying to get Python to work correctly on my cluster created using the spark-ec2 script. The following link was really helpful: https://issues.apache.org/jira/browse/SPARK-922 I am still running into a problem with matplotlib (it works fine on my Mac). I can not

problem with spark-ec2 launch script Re: spark-ec2 ERROR: Line magic function `%matplotlib` not found

2014-09-26 Thread Andy Davidson
`%matplotlib` not found Maybe you have Python 2.7 on the master but Python 2.6 on the cluster. You should upgrade Python to 2.7 on the cluster, or use Python 2.6 on the master by setting PYSPARK_PYTHON=python2.6 On Thu, Sep 25, 2014 at 5:11 PM, Andy Davidson a...@santacruzintegration.com wrote: Hi I am

Re: problem with spark-ec2 launch script Re: spark-ec2 ERROR: Line magic function `%matplotlib` not found

2014-09-26 Thread Andy Davidson
or not, but if you want to try manually upgrading Python on a cluster launched by spark-ec2, there are some instructions in the comments here https://issues.apache.org/jira/browse/SPARK-922 for doing so. Nick On Fri, Sep 26, 2014 at 2:18 PM, Andy Davidson a...@santacruzintegration.com wrote

spark-ec2 ERROR: Line magic function `%matplotlib` not found

2014-09-25 Thread Andy Davidson
Hi, I am running into trouble using iPython notebook on my cluster. I use the following command to set the cluster up: $ ./spark-ec2 --key-pair=$KEY_PAIR --identity-file=$KEY_FILE --region=$REGION --slaves=$NUM_SLAVES launch $CLUSTER_NAME On the master I launch python as follows: $

RDD pipe example. Is this a bug or a feature?

2014-09-19 Thread Andy Davidson
Hi, I wrote a little Java job to try to figure out how RDD pipe works. Below is my test shell script. If I turn on debugging in the script, I get output in my console. If debugging is turned off in the shell script, I do not see anything in my console. Is this a bug or a feature? I am running

Re: spark-1.1.0-bin-hadoop2.4 java.lang.NoClassDefFoundError: org/codehaus/jackson/annotate/JsonClass

2014-09-18 Thread Andy Davidson
After lots of hacking I figured out how to resolve this problem. This is good solution. It severely cripples Jackson, but at least for now I am unblocked. 1) Turn off annotations: mapper.configure(Feature.USE_ANNOTATIONS, false); 2) In Maven, set the Jackson dependencies as provided.

spark-1.1.0-bin-hadoop2.4 java.lang.NoClassDefFoundError: org/codehaus/jackson/annotate/JsonClass

2014-09-17 Thread Andy Davidson
Hi, I am new to Spark. I am trying to write a simple Java program that processes tweets that were collected and stored in a file. I figured the simplest thing to do would be to convert the JSON string into a Java map. When I submit my jar file I keep getting the following error

how to report documentation bug?

2014-09-16 Thread Andy Davidson
http://spark.apache.org/docs/latest/quick-start.html#standalone-applications Click on the Java tab. There is a bug in the Maven section: <version>1.1.0-SNAPSHOT</version> should be <version>1.1.0</version> Hope this helps Andy

SparkSql newbie problems with nested selects

2014-07-13 Thread Andy Davidson
Hi I am running into trouble with a nested query using python. To try and debug it, I first wrote the query I want using sqlite3 select freq.docid, freqTranspose.docid, sum(freq.count * freqTranspose.count) from Frequency as freq, (select term, docid, count from Frequency) as

Re: SparkSql newbie problems with nested selects

2014-07-13 Thread Andy Davidson
(select term, docid, count from Frequency) freqTranspose where freq.term = freqTranspose.term group by freq.docid, freqTranspose.docid) Michael On Sun, Jul 13, 2014 at 12:43 PM, Andy Davidson a...@santacruzintegration.com wrote: Hi I am running into trouble
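The sqlite3 version of the query discussed in this thread can be tried locally. The following is only a minimal sketch: it assumes the truncated query in the original post ends with the `where` and `group by` clauses quoted in the reply, and the `Frequency` rows are made-up sample data, not data from the thread.

```python
import sqlite3

# In-memory database with the Frequency(docid, term, count) layout
# implied by the query in the thread. The rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table Frequency (docid text, term text, count integer)")
conn.executemany(
    "insert into Frequency values (?, ?, ?)",
    [
        ("d1", "spark", 2),
        ("d1", "java", 1),
        ("d2", "spark", 3),
        ("d2", "python", 4),
    ],
)

# Nested select: join Frequency with a copy of itself on term and sum
# the products of the counts for every pair of documents.
QUERY = """
select freq.docid, freqTranspose.docid,
       sum(freq.count * freqTranspose.count)
from Frequency as freq,
     (select term, docid, count from Frequency) as freqTranspose
where freq.term = freqTranspose.term
group by freq.docid, freqTranspose.docid
"""

# Map each (docid, docid) pair to its summed product.
similarity = {(a, b): total for a, b, total in conn.execute(QUERY)}
conn.close()

for pair in sorted(similarity):
    print(pair, similarity[pair])
```

Having a small sqlite3 result like this gives a reference to compare against when porting the same nested select to Spark SQL.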
