I have 3 Spark Masters colocated with my ZooKeeper nodes and 2 Worker nodes,
so my NameNodes are on the same nodes as my Spark Masters and my DataNodes
are on the same nodes as my Spark Workers. Is that correct? How do I set up
HDFS with ZooKeeper?
On Fri, Feb 3, 2017 at 10:27 PM, Mark Hamstra
wrote:
> yes
>
> On Fri, Feb 3, 2017 at 10:08 PM, kant kodali wrote:
>
>> can I use Spark Standalone with HDFS but no YARN?
>>
>> Thanks!
>>
>
>
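For reference, HDFS high availability with ZooKeeper is usually wired up
along these lines (a sketch only; the nameservice id and hostnames are
placeholders, and HDFS HA in Hadoop 2.x supports exactly two NameNodes, so
only two of the three master nodes would carry one):

  # hdfs-site.xml
  dfs.nameservices = mycluster
  dfs.ha.namenodes.mycluster = nn1,nn2
  dfs.ha.automatic-failover.enabled = true

  # core-site.xml
  fs.defaultFS = hdfs://mycluster
  ha.zookeeper.quorum = zk1:2181,zk2:2181,zk3:2181

With automatic failover enabled, the ZKFC daemons use the ZooKeeper quorum
to elect the active NameNode, so the same three ZooKeeper nodes can serve
both Spark master election and HDFS failover.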
yes
On Fri, Feb 3, 2017 at 10:08 PM, kant kodali wrote:
> can I use Spark Standalone with HDFS but no YARN?
>
> Thanks!
>
can I use Spark Standalone with HDFS but no YARN?
Thanks!
Sorry, I should just do this:
./start-slave.sh spark://x.x.x.x:7077,y.y.y.y:7077,z.z.z.z:7077
But what about export SPARK_MASTER_HOST="x.x.x.x y.y.y.y z.z.z.z"? Don't
I need to have that on my worker node?
Thanks!
On Fri, Feb 3, 2017 at 4:57 PM, kant kodali wrote:
> Hi,
>
> How do I start a
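For what it's worth, a sketch of how this is usually set up in standalone
HA mode (addresses are placeholders): SPARK_MASTER_HOST is read by the
master to choose its own bind address, one host per machine, while a worker
just takes the full master list on the command line and needs no
SPARK_MASTER_HOST at all:

  # spark-env.sh on each master (that machine's own address only):
  export SPARK_MASTER_HOST=x.x.x.x

  # on each worker:
  ./sbin/start-slave.sh spark://x.x.x.x:7077,y.y.y.y:7077,z.z.z.z:7077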
Hi,
How do I start a slave? Just run the start-slave.sh script? But then I
don't understand the following. I put this in spark-env.sh on the worker
machine:
export SPARK_MASTER_HOST="x.x.x.x y.y.y.y z.z.z.z"
but start-slave.sh doesn't seem to take the SPARK_MASTER_HOST env variable,
so I did th
I may have found my problem. We have a Scala wrapper on top of spark-submit
to run the shell command through Scala.
We were effectively eating the exit code from spark-submit in that wrapper.
When I looked at what the actual exit code was, stripping away the wrapper,
I got 1.
So I think spark-submit is
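A minimal sketch of a wrapper that propagates the code instead of eating it
(the spark-submit arguments here are placeholders):

  import scala.sys.process._

  object SubmitWrapper {
    def main(args: Array[String]): Unit = {
      // '!' runs the external process and returns its exit code
      val exitCode = Seq("spark-submit", "--class", "com.example.Main", "app.jar").!
      sys.exit(exitCode) // propagate spark-submit's code to our caller
    }
  }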
Hey Asher,
A phone call may be the best way to discuss all of this. But in short:
1. It is quite easy to add custom pipelines/models to MLeap. All of our
out-of-the-box transformers can serve as a good example of how to do this.
We are also putting together documentation on how to do this in our docs
Hi,
➜ spark git:(master) ✗ ./bin/spark-submit whatever || echo $?
Error: Cannot load main class from JAR file:/Users/jacek/dev/oss/spark/whatever
Run with --help for usage help or --verbose for debug output
1
I see exit code 1, and there are other cases that return 1 too.
Pozdrawiam,
Jacek Laskowski
https://
Hello,
+1, I have exactly the same issue. I need the exit code to make decisions
about Oozie executing actions. spark-submit always returns 0 when the
exception is caught. From Spark 1.5 to 1.6.x, I still have the same issue...
It would be great to fix it, or to know if there is some workaround about
Hi,
An interesting case. You don't use Spark resources whatsoever.
Creating a SparkConf does not use YARN...yet. I think any run mode
would have the same effect. So, although spark-submit could have
returned exit code 1, the use case touches Spark very little.
What version is that? Do you see "Th
Hi,
Yes. Forget about SQLContext. It's been merged into SparkSession as of
Spark 2.0 (same for HiveContext).
Long live SparkSession! :-)
Jacek
On 3 Feb 2017 7:48 p.m., "☼ R Nair (रविशंकर नायर)" <
ravishankar.n...@gmail.com> wrote:
All,
In Spark 1.6.0, we used
val jdbcDF = sqlContext.read.
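A minimal sketch of the Spark 2.x equivalent, assuming a JDBC source (the
URL, table, and user are placeholders):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("JdbcRead")
    .getOrCreate()

  val jdbcDF = spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://dbhost:5432/mydb")
    .option("dbtable", "myschema.mytable")
    .option("user", "myuser")
    .load()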
Asher,
I found a Maven profile for Scala 2.11 and removed it. Now it brings in 2.10.
I ran some code and got further, but now I get the error below when I do a
“df.show”.
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:50)
at
org.apache.spark.sql.execu
You can see in the tree what's pulling in 2.11. Your option then will be to
either shade them and add an explicit dependency on Scala 2.10.5 in your pom.
Alternatively, you can explore upgrading your project to 2.11 (which will
require using a 2.11 build of Spark).
On Fri, Feb 3, 2017 at 2:03 PM, Benjam
Hi All,
I wrote a test script which always throws an exception, as below:

import org.apache.spark.SparkConf

object Test {
  def main(args: Array[String]) {
    try {
      val conf = new SparkConf().setAppName("Test")
      throw new RuntimeException("Some Exception")
      println("all done!")
    } catch {
      case e: Exception => println("caught: " + e) // swallowed, so the JVM exits 0
    }
  }
}
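A sketch of one way to surface the failure: exit non-zero (or simply
rethrow) in the catch block, so the driver JVM, and therefore spark-submit,
reports it:

  } catch {
    case e: Exception =>
      e.printStackTrace()
      sys.exit(1) // or `throw e`; either way spark-submit returns non-zero
  }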
Asher,
You’re right. I don’t see anything but 2.11 being pulled in. Do you know where
I can change this?
Cheers,
Ben
> On Feb 3, 2017, at 10:50 AM, Asher Krim wrote:
>
> Sorry for my persistence, but did you actually run "mvn dependency:tree
> -Dverbose=true"? And did you see only scala 2.10.5 being pulled in?
Hi there,
Are you sure that the cluster nodes where the executors run have network
connectivity to the elastic cluster?
Speaking of which, why don't you use:
https://github.com/elastic/elasticsearch-hadoop#apache-spark ?
Cheers,
Anastasios
On Fri, Feb 3, 2017 at 7:10 PM, Dmitry Goldenberg
wrot
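A minimal write-path check with the connector (a sketch; the host, index,
and type names are placeholders, and the exact artifact depends on your
ES/Spark versions):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.elasticsearch.spark._

  val conf = new SparkConf()
    .setAppName("EsWriteCheck")
    .set("es.nodes", "elastic-host") // must be reachable from the executors
    .set("es.port", "9200")
  val sc = new SparkContext(conf)

  // write one trivial document
  sc.makeRDD(Seq(Map("id" -> 1, "msg" -> "ping")))
    .saveToEs("spark_test/doc")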
Sorry for my persistence, but did you actually run "mvn dependency:tree
-Dverbose=true"? And did you see only scala 2.10.5 being pulled in?
On Fri, Feb 3, 2017 at 12:33 PM, Benjamin Kim wrote:
> Asher,
>
> It’s still the same. Do you have any other ideas?
>
> Cheers,
> Ben
>
>
> On Feb 3, 2017,
All,
In Spark 1.6.0, we used
val jdbcDF = sqlContext.read.format(-)
for creating a data frame through JDBC.
In Spark 2.1.x, we have seen this is
val jdbcDF = *spark*.read.format(-)
Does that mean we should not be using sqlContext going forward? Also, we
see that sqlContext is not auto
I have a bunch of questions for you, Hollin:
How easy is it to add support for custom pipelines/models?
Are Spark MLlib models supported?
We currently run Spark in local mode in an API service. It's not super
terrible, but performance is a constant struggle. Have you benchmarked any
performance dif
Thanks, Fernando. But I need to have only 1 row for a given user and date,
with very low latency. So none of your options work for me.
On Fri, Feb 3, 2017 at 10:34 AM, Fernando Avalos wrote:
> Hi Shyla,
>
> Maybe I am wrong, but I can see two options here.
>
> 1.- Do some grouping before insert to
Hi All,
I wanted to add more info.
The first column is the user and the third is the period, and my key is
(userid, date). For a given user and date combination I want to see only 1
row. My problem is that PT0H10M0S is overwritten by PT0H9M30S, even though
the order of the rows in the RDD is PT0H
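One way to get a single row per key is to collapse the RDD before writing,
so write order stops mattering. A sketch, assuming the longer period should
win and that the case class, keyspace, and column names below are
placeholders:

  import java.time.Duration
  import com.datastax.spark.connector._
  import org.apache.spark.rdd.RDD

  case class Visit(userId: String, date: String, period: String, day: String)

  def writeDeduped(rdd: RDD[Visit]): Unit = {
    val deduped = rdd
      .map(v => ((v.userId, v.date), v))
      .reduceByKey { (a, b) =>
        // compare the ISO-8601 durations properly, not lexicographically
        if (Duration.parse(a.period).compareTo(Duration.parse(b.period)) >= 0) a else b
      }
      .values
    deduped.saveToCassandra("my_keyspace", "my_table",
      SomeColumns("user_id", "date", "period", "day"))
  }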
Hi,
Any reason why we might be getting this error? The code seems to work fine
in the non-distributed mode but the same code when run from a Spark job is
not able to get to Elastic.
Spark version: 2.0.1 built for Hadoop 2.4, Scala 2.11
Elastic version: 2.3.1
I've verified the Elastic hosts and
Asher,
It’s still the same. Do you have any other ideas?
Cheers,
Ben
> On Feb 3, 2017, at 8:16 AM, Asher Krim wrote:
>
> Did you check the actual maven dep tree? Something might be pulling in a
> different version. Also, if you're seeing this locally, you might want to
> check which version
I'll clean up any .m2 or .ivy directories and try again.
I ran this on our lab cluster for testing.
Cheers,
Ben
On Fri, Feb 3, 2017 at 8:16 AM Asher Krim wrote:
> Did you check the actual maven dep tree? Something might be pulling in a
> different version. Also, if you're seeing this locally
Hey Aseem,
We have built pipelines that execute several string indexers, one-hot
encoders, scaling, and a random forest or linear regression at the end.
Execution time for the linear regression was on the order of 11
microseconds, a bit longer for random forest. This can be further optimized
by us
Did you check the actual Maven dependency tree? Something might be pulling
in a different version. Also, if you're seeing this locally, you might want
to check which version of the Scala SDK your IDE is using.
Asher Krim
Senior Software Engineer
On Thu, Feb 2, 2017 at 5:43 PM, Benjamin Kim wrote:
> Hi
Hi,
Can you guys tell me if the two pieces of code below return the same thing:
(((DoubleObjectInspector) ins2).get(obj)) and ((DoubleWritable) obj).get()?
code 1)
public Object get(Object name) {
  int pos = getPos((String) name);
  if (pos < 0) return null;
  Stri
Hi,
I'm new to Spark Streaming.
I'm using the 2.10 builds of spark core and spark streaming.
My issue is that when I try to use JavaPairDStream.foreachRDD:
test.foreachRDD(new Function<JavaPairRDD, Void>() {
  public Void call(JavaPairRDD rdd) {
    currentResponseCodeCounts =
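If this fails to compile against Spark 2.x, one likely cause: the
Function<..., Void> overload of foreachRDD was removed there, and the Java
API takes a VoidFunction instead. A sketch, with the pair types
(String, Long) being an assumption:

  import org.apache.spark.api.java.JavaPairRDD;
  import org.apache.spark.api.java.function.VoidFunction;

  test.foreachRDD(new VoidFunction<JavaPairRDD<String, Long>>() {
    @Override
    public void call(JavaPairRDD<String, Long> rdd) {
      // update the counts from this batch's RDD here
    }
  });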
Does this support Java 7?
On Fri, Feb 3, 2017 at 5:30 PM, Aseem Bansal wrote:
> Is computational time for predictions on the order of few milliseconds (<
> 10 ms) like the old mllib library?
>
> On Thu, Feb 2, 2017 at 10:12 PM, Hollin Wilkins wrote:
>
>> Hey everyone,
>>
>>
>> Some of you may h
Is computational time for predictions on the order of a few milliseconds
(< 10 ms) like the old mllib library?
On Thu, Feb 2, 2017 at 10:12 PM, Hollin Wilkins wrote:
> Hey everyone,
>
>
> Some of you may have seen Mikhail and I talk at Spark/Hadoop Summits about
> MLeap and how you can use it to b
Hi,
Is a bipartite projection possible with GraphX?
Rdd1
#id name
1 x1
2 x2
3 x3
4 x4
5 x5
6 x6
7 x7
8 x8
Rdd2
#id name
10001 y1
10002 y2
10003 y3
10004 y4
10005 y5
10006 y6
EdgeList
# src_id dest_id
1 10001
1 10002
2
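A sketch of one way to build the x-side projection with plain pair-RDD
joins before handing the result to GraphX (names and types are assumptions;
two x-vertices become connected when they share a y-neighbour):

  import org.apache.spark.graphx._
  import org.apache.spark.rdd.RDD

  // edgeList: RDD[(VertexId, VertexId)] of (xId, yId) pairs, e.g. (1L, 10001L)
  def projectOntoX(edgeList: RDD[(VertexId, VertexId)]): Graph[Int, Int] = {
    val byY = edgeList.map(_.swap)          // (yId, xId)
    val projected = byY.join(byY)           // x pairs sharing the same y
      .map { case (_, (a, b)) => (a, b) }
      .filter { case (a, b) => a < b }      // drop self-pairs and duplicates
      .distinct()
    Graph.fromEdgeTuples(projected, defaultValue = 0)
  }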
Hi Team,
Actually, I figured out something.
While the Hive Java UDF executed in Hive gives output with 10 decimal
digits of precision, in Spark the same UDF gives results rounded off to 6
decimal digits. How do I stop that? It's the same Java UDF jar file used in
both Hive and Spark.
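If the rounding turns out to happen in how Spark types or formats the UDF's
result rather than in the UDF itself, one hedged thing to try is pinning the
output type in the query (the function and table names here are
hypothetical):

  // register the same jar/UDF as in Hive, then force a wide decimal
  spark.sql("CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUdf'")
  spark.sql("SELECT CAST(my_udf(col) AS DECIMAL(20,10)) FROM my_table").show()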
Hello All,
This is the content of my RDD, which I am saving to a Cassandra table.
But it looks like the 2nd row is written first and then the first row
overwrites it, so I end up with bad output.
(494bce4f393b474980290b8d1b6ebef9, 2017-02-01, PT0H9M30S, WEDNESDAY)
(494bce4f393b474980290b8d1b6ebef9, 20