Reading from cassandra store in rdd

2016-05-04 Thread Yasemin Kaya
Hi, I asked this question in the DataStax group, but I want to ask the spark-user group as well, since someone may have faced this problem. I have data in Cassandra and want to load it into a Spark RDD. I got an error, searched for it, but nothing changed. Is there anyone who can help me fix it? I can connect to Cassandra with cqlsh
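A minimal sketch of reading a Cassandra table into a Spark RDD with the spark-cassandra-connector Java API; the keyspace, table, and host below are placeholders, not the ones from the original post:

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import com.datastax.spark.connector.japi.CassandraRow;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class CassandraReadSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("CassandraRead")
                // must point at a reachable Cassandra node (hypothetical address here)
                .set("spark.cassandra.connection.host", "127.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // pull the whole table as generic CassandraRow objects
        JavaRDD<CassandraRow> rows = javaFunctions(sc)
                .cassandraTable("my_keyspace", "my_table");

        System.out.println("row count = " + rows.count());
        sc.stop();
    }
}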

Re: Saving model S3

2016-03-21 Thread Yasemin Kaya
Hi Ted, I don't understand what you are asking about. Could you be more clear, please? 2016-03-21 15:24 GMT+02:00 Ted Yu <yuzhih...@gmail.com>: > Was speculative execution enabled ? > > Thanks > > On Mar 21, 2016, at 6:19 AM, Yasemin Kaya <godo...@gmail.com&

Saving model S3

2016-03-21 Thread Yasemin Kaya
Hi, I am reading data from S3 and I also want to save my model to S3. In the reading part there is no error, but when I save the model I get this error. I tried changing the scheme from s3n to s3a, but nothing changed; different errors come up. *reading
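A minimal sketch of saving an MLlib model to S3, assuming an s3a URI and credentials supplied through the Hadoop configuration; the bucket, path, and access keys are placeholders:

// assumption: "sc" is an existing JavaSparkContext and "model" is any MLlib model
// that implements Saveable (e.g. a RandomForestModel)
sc.hadoopConfiguration().set("fs.s3a.access.key", "YOUR_ACCESS_KEY");   // placeholder
sc.hadoopConfiguration().set("fs.s3a.secret.key", "YOUR_SECRET_KEY");   // placeholder

// Saveable.save takes the underlying SparkContext and a target directory
model.save(sc.sc(), "s3a://my-bucket/models/my-model");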

Re: reading file from S3

2016-03-16 Thread Yasemin Kaya
of course something >>> that Amazon strongly suggests that we do not use. Please use roles and you >>> will not have to worry about security. >>> >>> Regards, >>> Gourav Sengupta >>> >>> On Tue, Mar 15, 2016 at 2:38 PM, Sabarish Sasidharan

Re: reading file from S3

2016-03-15 Thread Yasemin Kaya
> Safak. > > 2016-03-15 12:33 GMT+02:00 Yasemin Kaya <godo...@gmail.com>: > >> Hi, >> >> I am using Spark 1.6.0 standalone and I want to read a txt file from S3 >> bucket named yasemindeneme and my file name is deneme.txt. But I am getting >> this er

reading file from S3

2016-03-15 Thread Yasemin Kaya
Hi, I am using Spark 1.6.0 standalone and I want to read a txt file from an S3 bucket named yasemindeneme; my file name is deneme.txt. But I am getting this error. Here is the simple code: Exception in thread "main"
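A minimal sketch of reading that file from S3 on a standalone cluster, assuming the s3n scheme and access keys passed via the Hadoop configuration (the keys are placeholders):

// assumption: "sc" is an existing JavaSparkContext
sc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");       // placeholder
sc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");   // placeholder

JavaRDD<String> lines = sc.textFile("s3n://yasemindeneme/deneme.txt");
System.out.println("line count = " + lines.count());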

concurrent.RejectedExecutionException

2016-01-23 Thread Yasemin Kaya
Hi all, I'm using Spark 1.5 and getting this error; could you help? I can't understand it. 16/01/23 10:11:59 ERROR TaskSchedulerImpl: Exception in statusUpdate java.util.concurrent.RejectedExecutionException: Task org.apache.spark.scheduler.TaskResultGetter$$anon$2@62c72719 rejected from

Re: write new data to mysql

2016-01-08 Thread Yasemin Kaya
dbc(MYSQL_CONNECTION_URL_WRITE, > "track_on_alarm", connectionProps) > > HTH. > > -Todd > > On Fri, Jan 8, 2016 at 10:53 AM, Ted Yu <yuzhih...@gmail.com> wrote: > >> Which Spark release are you using ? >> >> For case #2, was there any error / clue in the l

Re: write new data to mysql

2016-01-08 Thread Yasemin Kaya
When I changed the version to 1.6.0, it worked. Thanks. 2016-01-08 21:27 GMT+02:00 Yasemin Kaya <godo...@gmail.com>: > Hi, > There is no write function that Todd mentioned or i cant find it. > The code and error are in gist > <https://gist.github.com/yaseminn/f5a2b78b126

write new data to mysql

2016-01-08 Thread Yasemin Kaya
Hi, I want to write a DataFrame to an existing MySQL table, but when I use *peopleDataFrame.insertIntoJDBC(MYSQL_CONNECTION_URL_WRITE, "track_on_alarm", false)* it says "Table track_on_alarm already exists." And when I use *peopleDataFrame.insertIntoJDBC(MYSQL_CONNECTION_URL_WRITE,
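A minimal sketch of appending to an existing MySQL table with the DataFrameWriter API (Spark 1.4+), which is the usual replacement for the deprecated insertIntoJDBC; the connection properties are placeholders:

// assumption: "peopleDataFrame" already exists and MYSQL_CONNECTION_URL_WRITE is a
// valid JDBC URL such as "jdbc:mysql://host:3306/db"
java.util.Properties connectionProps = new java.util.Properties();
connectionProps.put("user", "dbuser");         // placeholder
connectionProps.put("password", "dbpassword"); // placeholder

peopleDataFrame.write()
        .mode(org.apache.spark.sql.SaveMode.Append)   // append instead of failing on an existing table
        .jdbc(MYSQL_CONNECTION_URL_WRITE, "track_on_alarm", connectionProps);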

Struggling time by data

2015-12-25 Thread Yasemin Kaya
Hi, I have struggled with this data for a couple of days and I can't find a solution. Could you help me? *DATA:* *(userid1_time, url)* *(userid1_time2, url2)* I want to get the URLs that fall within 30 minutes. *RESULT:* *If time2-time1 < 30 min* *(user1, [url1, url2])* Best, yasemin -- hiç ender hiç
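A minimal sketch of one way to approach this, assuming the keys look like "userid_epochMillis" and the goal is to collect, per user, the URLs whose timestamps fall within 30 minutes of the user's first event (variable names are hypothetical):

import java.util.ArrayList;
import java.util.List;
import scala.Tuple2;

// assumption: "data" is a JavaPairRDD<String, String> of (userid_time, url) pairs
JavaPairRDD<String, Tuple2<Long, String>> byUser = data.mapToPair(t -> {
    String[] parts = t._1().split("_");              // ["userid", "time"]
    long time = Long.parseLong(parts[1]);            // assumption: epoch milliseconds
    return new Tuple2<>(parts[0], new Tuple2<>(time, t._2()));
});

JavaPairRDD<String, List<String>> urlsWithin30Min = byUser.groupByKey().mapValues(events -> {
    List<Tuple2<Long, String>> sorted = new ArrayList<>();
    events.forEach(sorted::add);
    sorted.sort((a, b) -> Long.compare(a._1(), b._1()));
    long first = sorted.get(0)._1();
    List<String> urls = new ArrayList<>();
    for (Tuple2<Long, String> e : sorted) {
        if (e._1() - first <= 30L * 60 * 1000) {      // within 30 minutes of the first event
            urls.add(e._2());
        }
    }
    return urls;
});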

Re: Struggling time by data

2015-12-25 Thread Yasemin Kaya
s(0), (s(1), > y)))}.groupByKey().filter{case (_, (a, b)) => abs(a._1, a._1) < 30min} > > does it work for you ? > > 2015-12-25 16:53 GMT+08:00 Yasemin Kaya <godo...@gmail.com>: > >> hi, >> >> I have struggled this data couple of days, i

rdd split into new rdd

2015-12-23 Thread Yasemin Kaya
Hi, I have data in *JavaPairRDD* format. For example: *(1610, {a=1, b=1, c=2, d=2})* I want to get a *JavaPairRDD* like, for example: *(1610, {a, b})* *(1610, {c, d})* Is there a way to solve this problem? Best, yasemin -- hiç ender hiç
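A minimal sketch, assuming the input is a JavaPairRDD<Integer, Map<String, Integer>> (the exact generic parameters were stripped from the post) and the goal is one output pair per distinct value, holding the set of keys that map to it:

import java.util.*;
import scala.Tuple2;

// assumption: "input" is a JavaPairRDD<Integer, Map<String, Integer>>
JavaPairRDD<Integer, Set<String>> split = input.flatMapToPair(t -> {
    // group the map's keys by their value, e.g. {a=1, b=1, c=2, d=2} -> {1=[a,b], 2=[c,d]}
    Map<Integer, Set<String>> byValue = new HashMap<>();
    for (Map.Entry<String, Integer> e : t._2().entrySet()) {
        byValue.computeIfAbsent(e.getValue(), v -> new HashSet<>()).add(e.getKey());
    }
    List<Tuple2<Integer, Set<String>>> out = new ArrayList<>();
    for (Set<String> keys : byValue.values()) {
        out.add(new Tuple2<>(t._1(), keys));          // (1610, {a, b}), (1610, {c, d})
    }
    return out;                                        // Spark 1.x flatMapToPair expects an Iterable
});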

Re: rdd split into new rdd

2015-12-23 Thread Yasemin Kaya
, b=1, c=2, d=2} >> >> Can you elaborate your criteria a bit more ? The above seems to be a Set, >> not a Map. >> >> Cheers >> >> On Wed, Dec 23, 2015 at 7:11 AM, Yasemin Kaya <godo...@gmail.com> wrote: >> >>> Hi, >>> >>>

groupByKey()

2015-12-08 Thread Yasemin Kaya
Hi, Sorry for the long input, but this is my situation. I have two lists and I want to groupByKey them, but some values of the list disappear; I can't understand this. (8867989628612931721,[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

rdd conversion

2015-10-26 Thread Yasemin Kaya
Hi, I have *JavaRDD<List<Map<Integer, ArrayList>>>* and I want to convert every map to a pair RDD, i.e. a *JavaPairRDD*. There is a loop over the list to get the indexed map, but when I write the code below, it returns me only one RDD. JavaPairRDD mapToRDD =
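A minimal sketch of flattening such a nested structure into a single pair RDD with flatMapToPair instead of building an RDD per map (which yields only the last one); the value type ArrayList<Integer> is an assumption, since the inner type parameter is not visible in the post:

import java.util.*;
import scala.Tuple2;

// assumption: "input" is a JavaRDD<List<Map<Integer, ArrayList<Integer>>>>
JavaPairRDD<Integer, ArrayList<Integer>> pairs = input.flatMapToPair(list -> {
    List<Tuple2<Integer, ArrayList<Integer>>> out = new ArrayList<>();
    for (Map<Integer, ArrayList<Integer>> m : list) {
        for (Map.Entry<Integer, ArrayList<Integer>> e : m.entrySet()) {
            out.add(new Tuple2<>(e.getKey(), e.getValue()));   // one pair per map entry
        }
    }
    return out;                                                 // Spark 1.x: Iterable
});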

Re: rdd conversion

2015-10-26 Thread Yasemin Kaya
hat was why you got one RDD. > > On Mon, Oct 26, 2015 at 9:40 AM, Yasemin Kaya <godo...@gmail.com> wrote: > >> Hi, >> >> I have *JavaRDD<List<Map<Integer, ArrayList>>>* and I want to >> convert every map to pairrdd, i mean >> * JavaPairRD

Model exports PMML (Random Forest)

2015-10-07 Thread Yasemin Kaya
Hi, I want to export my model to PMML, but there is no support for random forest yet; it is planned for the 1.6 version. Is it possible to produce my (random forest) model in PMML XML format manually? Thanks. Best, yasemin -- hiç ender hiç

ML Pipeline

2015-09-28 Thread Yasemin Kaya
Hi, I am using Spark 1.5 and the ML Pipeline. I create the model, then give the model unlabeled data to find the probabilities and predictions. When I want to see the results, it returns an error. //creating model final PipelineModel model = pipeline.fit(trainingData); JavaRDD rowRDD1 = unlabeledTest

Re: spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Yasemin Kaya
ort sqlContext.implicits._" and then call > "rdd.toDf()" on your RDD to convert it into a dataframe. > > On Fri, Sep 18, 2015 at 7:32 AM, Yasemin Kaya <godo...@gmail.com> wrote: > >> Hi, >> >> I am using *spark 1.5, ML Pipeline Decision Tree >> <ht

spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Yasemin Kaya
Hi, I am using the *Spark 1.5 ML Pipeline Decision Tree * to get the tree's probabilities, but I have to convert my data to the DataFrame type. While creating the model there is no problem, but when I use the model on my data there is a
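A minimal sketch of turning a JavaRDD into a DataFrame before calling the fitted pipeline on it, which is the step the error usually points at in Spark 1.5; the variable names are hypothetical:

import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// assumptions: "sc" is a JavaSparkContext, "unlabeledTest" is a JavaRDD<LabeledPoint>,
// and "model" is a PipelineModel produced by pipeline.fit(trainingData)
SQLContext sqlContext = new SQLContext(sc);
DataFrame testDf = sqlContext.createDataFrame(unlabeledTest, LabeledPoint.class);

DataFrame predictions = model.transform(testDf);
predictions.select("prediction", "probability").show();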

Re: Random Forest MLlib

2015-09-15 Thread Yasemin Kaya
Hi Maximo, Is there a way of getting precision and recall from the pipeline? In the MLlib version I get precision and recall metrics from MulticlassMetrics, but the ML Pipeline gives only testErr. Thanks yasemin 2015-09-10 17:47 GMT+03:00 Yasemin Kaya <godo...@gmail.com>: > Hi Maximo, > Thanks
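A minimal sketch of feeding a pipeline's output back into MulticlassMetrics to get precision and recall; the column name "indexedLabel" is an assumption about how the label column is named in the pipeline:

import org.apache.spark.mllib.evaluation.MulticlassMetrics;
import scala.Tuple2;

// assumption: "predictions" is the DataFrame returned by PipelineModel.transform()
JavaPairRDD<Object, Object> predictionAndLabel = predictions
        .select("prediction", "indexedLabel")
        .toJavaRDD()
        .mapToPair(row -> new Tuple2<>(row.get(0), row.get(1)));

MulticlassMetrics metrics = new MulticlassMetrics(predictionAndLabel.rdd());
System.out.println("precision = " + metrics.precision());
System.out.println("recall    = " + metrics.recall());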

Multilabel classification support

2015-09-11 Thread Yasemin Kaya
Hi, I want to use MLlib for multilabel classification, but I found http://spark.apache.org/docs/latest/mllib-classification-regression.html and it is not what I mean. Is there a way to use multilabel classification? Thanks a lot. Best, yasemin -- hiç ender hiç

Random Forest MLlib

2015-09-10 Thread Yasemin Kaya
Hi, I am using the Random Forest algorithm for a recommendation system. I have users and the users' responses, yes or no (1/0), but I want to learn the probability from the trees: the program says 'yes' for user x, but with how much probability? I want to get these probabilities. Best, yasemin -- hiç ender hiç

Re: Random Forest MLlib

2015-09-10 Thread Yasemin Kaya
Hi Maximo, Thanks a lot. Hi Yasemin, We had the same question and found this: https://issues.apache.org/jira/browse/SPARK-6884 Thanks, Maximo On Sep 10, 2015, at 9:09 AM, Yasemin Kaya <godo...@gmail.com> wrote: Hi , I am using Random Forest Alg. for recommendation system. I get

Re: EC2 cluster doesn't work saveAsTextFile

2015-08-10 Thread Yasemin Kaya
http://polyglotprogramming.com On Mon, Aug 10, 2015 at 7:08 AM, Yasemin Kaya godo...@gmail.com wrote: Hi, I have EC2 cluster, and am using spark 1.3, yarn and HDFS . When i submit at local there is no problem , but i run at cluster, saveAsTextFile doesn't work.*It says me User class threw

EC2 cluster doesn't work saveAsTextFile

2015-08-10 Thread Yasemin Kaya
Hi, I have an EC2 cluster and am using Spark 1.3, YARN, and HDFS. When I submit locally there is no problem, but when I run on the cluster, saveAsTextFile doesn't work. *It says User class threw exception: Output directory hdfs://172.31.42.10:54310/./weblogReadResult

java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Hi, I have a little Spark program and I am getting an error that I don't understand. My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3. I am using Spark 1.3. Submitting: bin/spark-submit --class MonthlyAverage --master local[4] weather.jar error: ~/spark-1.3.1-bin-hadoop2.4$

Re: java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Thanx Ted, I solved it :) 2015-08-08 14:07 GMT+03:00 Ted Yu yuzhih...@gmail.com: Have you tried including the package name in the class name ? Thanks On Aug 8, 2015, at 12:00 AM, Yasemin Kaya godo...@gmail.com wrote: Hi, I have a little spark program and i am getting an error why i dont

Re: Amazon DynamoDB Spark

2015-08-07 Thread Yasemin Kaya
Thanx Jay. 2015-08-07 19:25 GMT+03:00 Jay Vyas jayunit100.apa...@gmail.com: In general the simplest way is that you can use the Dynamo Java API as is and call it inside a map(), and use the asynchronous put() Dynamo api call . On Aug 7, 2015, at 9:08 AM, Yasemin Kaya godo...@gmail.com
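A minimal sketch along those lines, using the AWS SDK v1 DynamoDB client inside foreachPartition so that one client is created per partition; the table name, key names, and RDD are hypothetical:

import java.util.HashMap;
import java.util.Map;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import scala.Tuple2;

// assumption: "results" is a JavaPairRDD<String, String> and the DynamoDB table
// "spark_results" has a string hash key named "id"
results.foreachPartition(rows -> {
    AmazonDynamoDBClient client = new AmazonDynamoDBClient();   // default credential chain
    while (rows.hasNext()) {
        Tuple2<String, String> row = rows.next();
        Map<String, AttributeValue> item = new HashMap<>();
        item.put("id", new AttributeValue(row._1()));
        item.put("value", new AttributeValue(row._2()));
        client.putItem("spark_results", item);
    }
});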

Amazon DynamoDB Spark

2015-08-07 Thread Yasemin Kaya
Hi, Is there a way to use DynamoDB in a Spark application? I have to persist my results to DynamoDB. Thanx, yasemin -- hiç ender hiç

Broadcast value

2015-06-12 Thread Yasemin Kaya
Hi, I am taking a broadcast value from a file. I want to use it to create a Rating object (ALS), but I am getting null. Here is my code https://gist.github.com/yaseminn/d6afd0263f6db6ea4ec5 : lines 17 and 18 are OK, but line 19 returns null, so line 21 gives me an error. I don't understand why. Do you have any idea

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
I couldn't find any solution. I can write but I can't read from Cassandra. 2015-06-09 8:52 GMT+03:00 Yasemin Kaya godo...@gmail.com: Thanks a lot, Mohammed, Gerard and Yana. I can write to the table, but an exception is returned. It says *Exception in thread main java.io.IOException: Failed to open

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
an out-of-the box cassandra conf where rpc_address: localhost # port for Thrift to listen for clients on rpc_port: 9160 On Tue, Jun 9, 2015 at 7:36 AM, Yasemin Kaya godo...@gmail.com wrote: I couldn't find any solution. I can write but I can't read from Cassandra. 2015-06-09 8:52 GMT

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
Sorry, here is my answer. I ran lsof -i:9160 in the terminal; the result is: lsof -i:9160 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME java 7597 inosens 101u IPv4 85754 0t0 TCP localhost:9160 (LISTEN); so is port 9160 available or not? 2015-06-09 17:16 GMT+03:00 Yasemin Kaya godo

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
, 2015 at 10:18 AM, Yasemin Kaya godo...@gmail.com wrote: Sorry my answer I hit terminal lsof -i:9160: result is lsof -i:9160 COMMAND PIDUSER FD TYPE DEVICE SIZE/OFF NODE NAME java7597 inosens 101u IPv4 85754 0t0 TCP localhost:9160 (LISTEN) so 9160 port is available

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
localhost -p 9160 Connected to: Test Cluster on localhost/9160 Welcome to Cassandra CLI version 2.1.5 On Tue, Jun 9, 2015 at 1:29 PM, Yasemin Kaya godo...@gmail.com wrote: My jar files are: cassandra-driver-core-2.1.5.jar cassandra-thrift-2.1.3.jar guava-18.jar jsr166e-1.1.0.jar spark

Re: Cassandra Submit

2015-06-08 Thread Yasemin Kaya
Kaya godo...@gmail.com wrote: Hi, I run my project on local. How can find ip address of my cassandra host ? From cassandra.yaml or ?? yasemin 2015-06-08 11:27 GMT+03:00 Gerard Maas gerard.m...@gmail.com: ? = ip address of your cassandra host On Mon, Jun 8, 2015 at 10:12 AM, Yasemin Kaya

Re: Cassandra Submit

2015-06-08 Thread Yasemin Kaya
Cassandra nodes. Mohammed *From:* Yasemin Kaya [mailto:godo...@gmail.com] *Sent:* Friday, June 5, 2015 7:31 AM *To:* user@spark.apache.org *Subject:* Cassandra Submit Hi, I am using cassandraDB in my project. I had that error *Exception in thread main java.io.IOException: Failed

Re: Cassandra Submit

2015-06-08 Thread Yasemin Kaya
Hi, I run my project locally. How can I find the IP address of my Cassandra host? From cassandra.yaml or ?? yasemin 2015-06-08 11:27 GMT+03:00 Gerard Maas gerard.m...@gmail.com: ? = ip address of your cassandra host On Mon, Jun 8, 2015 at 10:12 AM, Yasemin Kaya godo...@gmail.com wrote: Hi

Cassandra Submit

2015-06-05 Thread Yasemin Kaya
Hi, I am using Cassandra in my project. I get this error: *Exception in thread main java.io.IOException: Failed to open native connection to Cassandra at {127.0.1.1}:9042* I think I have to modify the submit line. What should I add or remove when I submit my project? Best, yasemin -- hiç
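A minimal sketch of pointing the spark-cassandra-connector at the right node, assuming the Cassandra native transport is listening on 127.0.0.1:9042; the same property can also be passed on the submit line as --conf spark.cassandra.connection.host=127.0.0.1:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

SparkConf conf = new SparkConf()
        .setAppName("CassandraSubmit")
        // must match the rpc_address / listen_address configured in cassandra.yaml
        .set("spark.cassandra.connection.host", "127.0.0.1");   // hypothetical address
JavaSparkContext sc = new JavaSparkContext(conf);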

ALS Rating Object

2015-06-03 Thread Yasemin Kaya
Hi, I want to use Spark's ALS in my project. I have user ids like 30011397223227125563254, and the Rating object that ALS uses wants an Integer as the user id, so the id field does not fit into a 32-bit Integer. How can I solve that? Thanks. Best, yasemin -- hiç ender hiç
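A minimal sketch of one common workaround, assuming the original ids can be replaced by surrogate integer ids generated with zipWithUniqueId(); the mapping must be kept so recommendations can be translated back to the original ids:

import org.apache.spark.mllib.recommendation.Rating;
import scala.Tuple2;

// assumption: "rawUserIds" is a JavaRDD<String> of the long string ids
JavaPairRDD<String, Long> userIdMap = rawUserIds.distinct().zipWithUniqueId();

// assumption: "rawRatings" is a JavaPairRDD<String, Tuple2<Integer, Double>> of
// (userId, (productId, rating)); join the surrogate id back onto it
JavaRDD<Rating> ratings = rawRatings.join(userIdMap)
        .map(t -> {
            Tuple2<Integer, Double> productAndRating = t._2()._1();
            long surrogateUserId = t._2()._2();
            return new Rating((int) surrogateUserId,            // fits in an int for modest user counts
                    productAndRating._1(), productAndRating._2());
        });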

Re: ALS Rating Object

2015-06-03 Thread Yasemin Kaya
: spark.ml.recommendation.ALS (in the Pipeline API) exposes ALS.train as a DeveloperApi to allow users to use Long instead of Int. We're also thinking about better ways to permit Long IDs. Joseph On Wed, Jun 3, 2015 at 5:04 AM, Yasemin Kaya godo...@gmail.com wrote: Hi, I want to use Spark's ALS in my project

Cassanda example

2015-06-01 Thread Yasemin Kaya
Hi, I want to write my RDD to a Cassandra database, and I took an example from this site: http://www.datastax.com/dev/blog/accessing-cassandra-from-spark-in-java. I added that to my project but I have errors. Here is my project in a gist: https://gist.github.com/yaseminn/aba86dad9a3e6d6a03dc. errors :
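For reference, a minimal sketch of the write path from that connector's Java API, assuming a keyspace "test", a table "words" with columns (word text, count int), and a matching JavaBean; these names are placeholders, not the ones from the gist:

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

// assumption: WordCount is a serializable JavaBean with word/count getters and setters,
// and "wordCounts" is a JavaRDD<WordCount>
javaFunctions(wordCounts)
        .writerBuilder("test", "words", mapToRow(WordCount.class))
        .saveToCassandra();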

Collabrative Filtering

2015-05-26 Thread Yasemin Kaya
Hi, In CF: String path = "data/mllib/als/test.data"; JavaRDD<String> data = sc.textFile(path); JavaRDD<Rating> ratings = data.map(new Function<String, Rating>() { public Rating call(String s) { String[] sarray = s.split(","); return new Rating(Integer.parseInt(sarray[0]), Integer.parseInt(sarray[1]),

map reduce ?

2015-05-21 Thread Yasemin Kaya
Hi, I have a JavaPairRDD<String, List<Integer>>, and here is an example of what I want to get:
user_id  cat1  cat2  cat3  cat4
522      0     1     2     0
62       1     0     3     0
661      1     2     0     1
query: the users who have a number (other than 0) in the cat1 and cat3 columns; answer: cat2 - 522, 661; cat3 - 522, 62 = user 522. How can I
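A minimal sketch of answering that kind of query, assuming the list holds the category counts in column order (index 0 = cat1, index 1 = cat2, ...) and that the goal is the users whose value is non-zero in both of two chosen columns:

import java.util.List;

// assumption: "userCats" is a JavaPairRDD<String, List<Integer>> keyed by user_id
final int colA = 1;   // cat2 (hypothetical choice of columns)
final int colB = 2;   // cat3

List<String> matchingUsers = userCats
        .filter(t -> t._2().get(colA) != 0 && t._2().get(colB) != 0)
        .keys()
        .collect();

System.out.println(matchingUsers);   // e.g. [522] for the sample table above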

reduceByKey

2015-05-14 Thread Yasemin Kaya
Hi, I have a JavaPairRDD<String, String> and I want to apply the reduceByKey method. My pair RDD: *2553: 0,0,0,1,0,0,0,0* 46551: 0,1,0,0,0,0,0,0 266: 0,1,0,0,0,0,0,0 *2553: 0,0,0,0,0,1,0,0* *225546: 0,0,0,0,0,1,0,0* *225546: 0,0,0,0,0,1,0,0* I want to get: *2553: 0,0,0,1,0,1,0,0* 46551:
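A minimal sketch, assuming the values are comma-separated 0/1 flags of equal length and that merging two rows for the same key means taking the element-wise maximum (which produces 0,0,0,1,0,1,0,0 for the two 2553 rows above):

// assumption: "pairs" is the JavaPairRDD<String, String> described above
JavaPairRDD<String, String> merged = pairs.reduceByKey((a, b) -> {
    String[] x = a.split(",");
    String[] y = b.split(",");
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < x.length; i++) {
        if (i > 0) out.append(",");
        out.append(Math.max(Integer.parseInt(x[i].trim()), Integer.parseInt(y[i].trim())));
    }
    return out.toString();
});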

Re: swap tuple

2015-05-14 Thread Yasemin Kaya
[evo.efti...@isecc.com] *Sent:* Thursday, May 14, 2015 1:24 PM *To:* 'Holden Karau'; 'Yasemin Kaya' *Cc:* user@spark.apache.org *Subject:* RE: swap tuple Where is the “Tuple” supposed to be in <String, String> - you can refer to a “Tuple” if it was e.g. <String, Tuple2<String, String>> *From

swap tuple

2015-05-14 Thread Yasemin Kaya
Hi, I have a *JavaPairRDD<String, String>* and I want to *swap tuple._1() with tuple._2()*. I use *tuple.swap()*, but the JavaPairRDD is not actually changed; when I print the JavaPairRDD, the values are the same. Can anyone help me with that? Thank you. Have a nice day. yasemin -- hiç ender hiç
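A minimal sketch of why this happens and how to fix it: RDDs are immutable, so swap() only produces a new tuple; the swapped pairs have to be captured in a new JavaPairRDD with mapToPair:

import scala.Tuple2;

// assumption: "pairs" is the JavaPairRDD<String, String> described above
JavaPairRDD<String, String> swapped = pairs.mapToPair(Tuple2::swap);

swapped.foreach(t -> System.out.println(t._1() + " -> " + t._2()));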

JavaPairRDD

2015-05-13 Thread Yasemin Kaya
Hi, I want to get a *JavaPairRDD<String, String>* from the tuple part of a *JavaPairRDD<String, Tuple2<String, String>>*. As an example: ( http://www.koctas.com.tr/reyon/el-aletleri/7,(0,1,0,0,0,0,0,0,46551)) is in my *JavaPairRDD<String, Tuple2<String, String>>* and I want to get *( (46551),

Re: JavaPairRDD

2015-05-13 Thread Yasemin Kaya
://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch04.html Tristan On 13 May 2015 at 23:12, Yasemin Kaya godo...@gmail.com wrote: Hi, I want to get *JavaPairRDD<String, String>* from the tuple part of *JavaPairRDD<String, Tuple2<String, String>>*. As an example: ( http

Content based filtering

2015-05-12 Thread Yasemin Kaya
Hi, is content-based filtering available for Spark in MLlib? If it isn't, what can I use as an alternative? Thank you. Have a nice day. yasemin -- hiç ender hiç

Spark Mongodb connection

2015-05-04 Thread Yasemin Kaya
Hi! I am new to Spark and I want to start with a simple wordCount example in Java, but I want to take my input from a MongoDB database. I want to learn how I can connect a MongoDB database to my project. Can anyone help with this issue? Have a nice day yasemin -- hiç ender hiç
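A minimal sketch of one way to do this with the mongo-hadoop connector (an assumption for this thread, since no connector is named in it); it exposes a collection as an RDD of (id, BSONObject) pairs that the usual wordCount steps can run over:

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.api.java.JavaPairRDD;
import org.bson.BSONObject;
import com.mongodb.hadoop.MongoInputFormat;
import scala.Tuple2;

// assumption: "sc" is a JavaSparkContext; the database, collection, and field names are placeholders
Configuration mongoConf = new Configuration();
mongoConf.set("mongo.input.uri", "mongodb://localhost:27017/mydb.sentences");

JavaPairRDD<Object, BSONObject> docs = sc.newAPIHadoopRDD(
        mongoConf, MongoInputFormat.class, Object.class, BSONObject.class);

JavaPairRDD<String, Integer> wordCounts = docs
        .flatMap(doc -> Arrays.asList(doc._2().get("text").toString().split(" ")))  // "text" field is an assumption
        .mapToPair(word -> new Tuple2<>(word, 1))
        .reduceByKey((a, b) -> a + b);

wordCounts.foreach(t -> System.out.println(t._1() + ": " + t._2()));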