New to spark.

2016-09-28 Thread Anirudh Muhnot
Hello everyone, I'm Anirudh. I'm fairly new to Spark; I've done an online specialisation from UC Berkeley. I know how to code in Python but have little to no idea about Scala. I want to contribute to Spark. Where do I start, and how? I'm reading the pull requests on GitHub

New to Spark

2015-12-01 Thread Ashok Kumar
Hi, I am new to Spark. I am trying to use spark-sql with SPARK CREATED and HIVE CREATED tables. I have successfully configured Spark to use the Hive metastore. In spark-sql I can see the DDL for Hive tables. However, when I do select count(1) from HIVE_TABLE it always returns zero rows. If I create

Re: New to spark.

2016-09-28 Thread Bryan Cutler
The linked JIRA query (URL-decoded): project = SPARK AND labels = Starter AND status in (Open, "In Progress", Reopened). On Wed, Sep 28, 2016 at 9:11 AM, Anirudh Muhnot wrote: > Hello everyone, I'm Anirudh. I'm fairly new to spark as I've done an > online specialisation from UC

Re: New to Spark

2015-12-01 Thread fightf...@163.com
hive config; that would help to locate the root cause of the problem. Best, Sun. fightf...@163.com From: Ashok Kumar Date: 2015-12-01 18:54 To: user@spark.apache.org Subject: New to Spark Hi, I am new to Spark. I am trying to use spark-sql with SPARK CREATED and HIVE CREATED tables. I have

Re: New to Spark

2015-12-01 Thread Ted Yu
Have you tried the following command: REFRESH TABLE? Cheers. On Tue, Dec 1, 2015 at 1:54 AM, Ashok Kumar wrote: > Hi, > > I am new to Spark. > > I am trying to use spark-sql with SPARK CREATED and HIVE CREATED tables. > > I have successfully made Hive metastore to be
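For context, `REFRESH TABLE` tells Spark SQL to discard its cached metadata and file listing for a table, after which rows written through Hive become visible to queries. A minimal sketch in the spark-sql shell, with `hive_table` standing in for the actual (elided) table name:

```sql
-- In the spark-sql shell; hive_table is a placeholder for the Hive-created table.
REFRESH TABLE hive_table;
SELECT count(1) FROM hive_table;
```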

New to Spark - Partitioning Question

2015-09-04 Thread mmike87
want to ensure that the RDD is partitioned by the Mine Identifier (an Integer). It's step 3 that is confusing me. I suspect it's very easy ... do I simply use PartitionByKey? We're using Java if that makes any difference. Thanks!
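For reference: Spark has no `PartitionByKey`; on a `JavaPairRDD` keyed by the mine identifier the relevant call is `partitionBy(new HashPartitioner(n))`. A running cluster is out of scope for this snippet, so the bucketing a `HashPartitioner` performs is sketched below in plain Java (the record shape and values are hypothetical):

```java
import java.util.*;

public class PartitionSketch {
    // Mirrors Spark's HashPartitioner: non-negative key.hashCode() mod numPartitions.
    static int bucket(Object key, int numPartitions) {
        int mod = key.hashCode() % numPartitions;
        return mod < 0 ? mod + numPartitions : mod;
    }

    public static void main(String[] args) {
        // Hypothetical (mineId, value) pairs keyed by an Integer mine identifier.
        int[][] pairs = { {101, 10}, {202, 20}, {101, 30} };
        Map<Integer, List<int[]>> partitions = new TreeMap<>();
        for (int[] p : pairs) {
            partitions.computeIfAbsent(bucket(p[0], 4), k -> new ArrayList<>()).add(p);
        }
        // All records sharing a mine id land in the same partition.
        for (Map.Entry<Integer, List<int[]>> e : partitions.entrySet()) {
            System.out.println("partition " + e.getKey() + " holds "
                    + e.getValue().size() + " record(s)");
        }
    }
}
```

Because the bucket depends only on the key's hash, every record with the same mine identifier is guaranteed to end up in the same partition, which is what step 3 asks for.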

NEW to spark and sparksql

2014-11-19 Thread Sam Flint
Hi, I am new to Spark. I have begun reading to understand Spark's RDDs as well as Spark SQL. My question is more about how to build out the RDDs and best practices. I have data that is broken down by hour into files on HDFS in Avro format. Do I need to create a separate RDD for

Re: New to Spark - Partitioning Question

2015-09-08 Thread Richard Marscher
> It's step 3 that is confusing me. I suspect it's very easy ... do I simply > use PartitionByKey? > > We're using Java if that makes any difference. > > Thanks!

Re: New to Spark - Partitioning Question

2015-09-08 Thread Mike Wright
>> 3) I then want to ensure that the RDD is partitioned by the Mine Identifier >> (an Integer). >> >> It's step 3 that is confusing me. I suspect it's very easy ... do I simply >> use PartitionByKey? >> >> We're using Java if that

Re: New to Spark - Partitioning Question

2015-09-09 Thread Richard Marscher
>>> To ensure that this works, the idea is to: >>> >>> 1) Filter the superset to relevant mines (done) >>> 2) Group the subset by the unique identifier for the mine. So, a group may >>> be

Re: NEW to spark and sparksql

2014-11-19 Thread Michael Armbrust
have been working on a library for Spark SQL. It's very early code, but you can find it here: https://github.com/databricks/spark-avro Bug reports welcome! Michael On Wed, Nov 19, 2014 at 1:02 PM, Sam Flint wrote: > Hi, > > I am new to spark. I have began to read to understand sparks

Re: NEW to spark and sparksql

2014-11-19 Thread Michael Armbrust
amming-guide.html >> >> For Avro in particular, I have been working on a library for Spark SQL. >> Its very early code, but you can find it here: >> https://github.com/databricks/spark-avro >> >> Bug reports welcome! >> >> Michael >> >> On

Re: NEW to spark and sparksql

2014-11-20 Thread Sam Flint
gest the programming >>> guides: >>> >>> http://spark.apache.org/docs/latest/quick-start.html >>> http://spark.apache.org/docs/latest/sql-programming-guide.html >>> >>> For Avro in particular, I have been working on a library for Spark SQL.

Re: NEW to spark and sparksql

2014-11-20 Thread Michael Armbrust
>>>> guides: >>>> >>>> http://spark.apache.org/docs/latest/quick-start.html >>>> http://spark.apache.org/docs/latest/sql-programming-guide.html >>>> >>>> For Avro in particular, I have been working on a library for Spark

New to spark 2.2.1 - Problem with finding tables between different metastore db

2018-02-06 Thread Subhajit Purkayastha
All, I am new to Spark 2.2.1. I have a single-node cluster and have also enabled the Thrift server so my Tableau application can connect to my persisted table. I suspect that the Spark cluster metastore is different from the Thrift server metastore. If this assumption is valid, what do I need to

new to Spark - trying to get a basic example to run - could use some help

2016-02-12 Thread Taylor, Ronald C
Hello folks, This is my first message to the list. I am new to Spark and trying to run the SparkPi example shown in the Cloudera documentation. We have Cloudera 5.5.1 running on a small cluster at our lab, with Spark 1.5. My trial invocation is given below. The output that I get *says* that I

Re: new to Spark - trying to get a basic example to run - could use some help

2016-02-13 Thread Chandeep Singh
> This is my first msg to the list. New to Spark, and trying to run the > SparkPi example shown in the Cloudera documentation. We have Cloudera > 5.5.1 running on a small cluster at our lab, with Spark 1.5. > > My trial invocation is given below. The output that I get *says* that

Re: new to Spark - trying to get a basic example to run - could use some help

2016-02-13 Thread Ted Yu
Log Type: stdout > Log Upload Time: Sat Feb 13 11:00:08 +0000 2016 > Log Length: 23 > Pi is roughly 3.140224 > > Hope that helps! > > On Sat, Feb 13, 2016 at 3:14 AM, Taylor, Ronald C wrote: >> Hello folks, >> >> This is

I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all of

2015-09-09 Thread prachicsa
I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all of these token values. I tried the following way: val ECtokens = for (token <- listofECtok

Re: I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all o

2015-09-09 Thread Akhil Das
.contains(item)) found = true } found }).collect() Output: res8: Array[String] = Array(This contains EC-17A5206955089011B) Thanks Best Regards On Wed, Sep 9, 2015 at 3:25 PM, prachicsa wrote: > > > I am very new to Spark. > > I have a very basic question. I have an
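The reply above builds a predicate that keeps any line containing at least one of the tokens. The same logic as a self-contained sketch (the original snippet is Scala; this restates it in Java, and the class and method names are illustrative):

```java
import java.util.*;
import java.util.stream.*;

public class TokenFilter {
    static final List<String> TOKENS =
        Arrays.asList("EC-17A5206955089011B", "EC-17A5206955089011A");

    // Keep lines that contain any of the tokens -- the predicate an RDD filter would use.
    static List<String> filterLines(List<String> lines) {
        return lines.stream()
                    .filter(line -> TOKENS.stream().anyMatch(line::contains))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "This contains EC-17A5206955089011B",
            "no token here");
        System.out.println(filterLines(lines));
    }
}
```

With Spark's Java API the same predicate would sit inside the transformation, e.g. `rdd.filter(line -> TOKENS.stream().anyMatch(line::contains))` on a `JavaRDD<String>`.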

Re: I am very new to Spark. I have a very basic question. I have an array of values: listofECtokens: Array[String] = Array(EC-17A5206955089011B, EC-17A5206955089011A) I want to filter an RDD for all o

2015-09-09 Thread Ted Yu
r(item <- tocks){ > if(line.contains(item)) found = true > } > found > }).collect() > > Output: > res8: Array[String] = Array(This contains EC-17A5206955089011B) > > Thanks > Best Regards