Re: Max number of columns

2017-06-16 Thread Jan Holmberg
Oops, wrong user list. Sorry. :-) > On 16 Jun 2017, at 10.44, Jan Holmberg <jan.holmb...@perigeum.fi> wrote:

Max number of columns

2017-06-16 Thread Jan Holmberg
Hi, I ran into a Kudu limitation on the max number of columns (300). The same limit seemed to apply to the latest Kudu version as well, but not to e.g. Impala/Hive (to the same extent, at least). * Is this limitation going to be loosened in the near future? * Any suggestions for how to get around this limitation? Table splitting is the
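One common way around a per-table column cap (sketched here in illustrative Python; the function and column names are made up, and this is not a Kudu API) is vertical partitioning: split the wide schema into several narrower tables that all share the same primary key, then join them back at query time.

```python
# Hypothetical sketch: split a wide schema into groups of columns so that each
# group, together with the shared key columns, stays within a 300-column limit.
MAX_COLS = 300

def split_wide_schema(key_cols, value_cols, max_cols=MAX_COLS):
    """Partition value columns into per-table groups; every group repeats the
    key columns so the resulting tables can be joined back together."""
    room = max_cols - len(key_cols)
    if room <= 0:
        raise ValueError("key columns alone exceed the column limit")
    return [key_cols + value_cols[i:i + room]
            for i in range(0, len(value_cols), room)]

# Example: 700 value columns plus one key column -> three tables of <= 300 cols.
tables = split_wide_schema(["id"], [f"c{i}" for i in range(700)])
```

The trade-off is that reads spanning many column groups need joins, so grouping columns that are queried together into the same table matters.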

Re: Stress testing hdfs with Spark

2016-04-05 Thread Jan Holmberg
ms Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com On 5 April 2016 at 20:56, Jan Holmberg <jan.holmb...@perigeum.fi

Re: Stress testing hdfs with Spark

2016-04-05 Thread Jan Holmberg
a for the IO monitoring On Tue, 5 Apr 2016, 20:56 Jan Holmberg <jan.holmb...@perigeum.fi> wrote: I'm trying to get a rough estimate of how much data I can write within a certain time period (GB/sec). -jan On 05 Apr 2016, at 22:49, Mich Talebzadeh
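The estimate asked for above is plain arithmetic: bytes written over wall-clock seconds. A minimal sketch (the worker count and duration below are made-up example figures, not measurements from the thread):

```python
# Throughput estimate for a write stress test: bytes written / elapsed seconds,
# expressed in decimal GB/sec.
def gb_per_sec(bytes_written, seconds):
    return bytes_written / seconds / 1e9

# Example: 10 workers each writing 1 GiB, finishing in 40 seconds.
rate = gb_per_sec(10 * 1024**3, 40.0)  # ~0.27 GB/sec
```

Note the GiB-vs-GB distinction: workers typically write power-of-two sizes, while throughput is usually quoted in decimal GB.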

Re: Stress testing hdfs with Spark

2016-04-05 Thread Jan Holmberg
the matrices? Throughput of data, latency, velocity, volume? HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com On 5 April 2016 at 20:42, Jan Ho

Stress testing hdfs with Spark

2016-04-05 Thread Jan Holmberg
Hi, I'm trying to figure out how to write lots of data from each worker. I tried rdd.saveAsTextFile but got an OOM when generating a 1024 MB string for a worker. Increasing worker memory would mean that I should drop the number of workers. So, any idea how to write e.g. a 1 GB file from each worker?
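The OOM comes from materializing the whole 1 GB string at once; generating lines lazily avoids it. A sketch of the core idea in illustrative Python (in Spark you would put a generator like this inside a mapPartitions call feeding saveAsTextFile, so each partition streams its output instead of buffering it; the tiny target size here is just for demonstration):

```python
# Yield fixed-size lines until roughly target_bytes have been produced,
# so memory use stays at one line rather than the whole payload.
def generate_lines(target_bytes, line="x" * 63 + "\n"):
    produced = 0
    while produced < target_bytes:
        yield line
        produced += len(line)

# Consume lazily -- with a real file you'd write each line as it is produced.
total = sum(len(l) for l in generate_lines(1024))  # 1 KiB here, 1 GiB per worker
```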

Long running jobs in CDH

2016-01-11 Thread Jan Holmberg
Hi, any preferences how to run constantly running jobs (streaming) in CDH? Oozie? Cmdline? Something else? cheers, -jan

Re: Writing partitioned Avro data to HDFS

2015-12-22 Thread Jan Holmberg
incorporate it. Probably you can file a JIRA and experienced contributors can share their thoughts. 1. https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala line 131 - Thanks, via mobile, excuse brevity. On D

Writing partitioned Avro data to HDFS

2015-12-22 Thread Jan Holmberg
Hi, I'm stuck with writing partitioned data to HDFS. The example below ends up with an 'already exists' error. I'm wondering how to handle the streaming use case. What is the intended way to write streaming data to HDFS? What am I missing? cheers, -jan import com.databricks.spark.avro._ import

Re: Writing partitioned Avro data to HDFS

2015-12-22 Thread Jan Holmberg
should give you a distinct path for every run. Let us know if it helps or if I missed anything. Good luck - Thanks, via mobile, excuse brevity. On Dec 22, 2015 2:31 PM, "Jan Holmberg" <jan.holmb...@perigeum.fi> wrote: Hi, I'm stuck with wri

Re: Writing partitioned Avro data to HDFS

2015-12-22 Thread Jan Holmberg
/tmp/data/year/month/02 Or, /tmp/data/01/year/month /tmp/data/02/year/month This is a workaround. Am sure other better approaches would follow. - Thanks, via mobile, excuse brevity. On Dec 22, 2015 7:01 PM, "Jan Holmberg" <jan.holmb...@perigeum
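The workaround suggested in these replies (a distinct path per run so the writer never hits an existing directory) can be sketched as follows. This is illustrative Python, not from the original thread; the base path, helper name, and partition layout are assumptions:

```python
from datetime import datetime, timezone

# Build a run-unique output path like <base>/<run_ts>/year=<y>/month=<m>,
# so every (micro-)batch writes to a fresh directory and no
# "already exists" error can occur.
def batch_output_path(base, year, month, when=None):
    when = when or datetime.now(timezone.utc)
    run_id = when.strftime("%Y%m%d%H%M%S")
    return f"{base}/{run_id}/year={year}/month={month:02d}"

p = batch_output_path("/tmp/data", 2015, 12, datetime(2015, 12, 22, 14, 31, 0))
```

Downstream readers then glob over the run directories (e.g. `/tmp/data/*/year=2015/month=12`), or a compaction job periodically merges runs into the final layout.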

Status stays at ACCEPTED

2014-05-20 Thread Jan Holmberg
Hi, I’m new to Spark and trying to test my first Spark program. I’m running SparkPi successfully in yarn-client mode, but when running the same in yarn-cluster mode the app gets stuck in the ACCEPTED phase. I’ve tried for hours to hunt down the reason but the outcome is always the same. Any hints what to look for next?

Re: Status stays at ACCEPTED

2014-05-20 Thread Jan Holmberg
node? If you go to the ResourceManager web UI, does it indicate any containers are running? -Sandy On May 19, 2014, at 11:43 PM, Jan Holmberg <jan.holmb...@perigeum.fi> wrote: Hi, I’m new to Spark and trying to test first Spark prog. I’m running SparkPi successfully in yarn-client

Re: Status stays at ACCEPTED

2014-05-20 Thread Jan Holmberg
that are fired when I start the Spark run: Zookeeper: caught end of stream exception. Yarn: The specific max attempts: 0 for application: 1 is invalid, because it is out of the range [1, 2]. Use the global max attempts. -jan On 20 May 2014, at 11:14, Jan Holmberg jan.holmb
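A plausible direction given the YARN message above (hedged: the exact knob depends on the Hadoop and Spark versions involved, and the thread does not confirm the resolution): the log says an application-specific max-attempts of 0 was submitted, which YARN rejects because per-application values must fall in [1, global max]. Making sure the global limit is sane, and that the client does not submit 0, is where one would look first. A yarn-site.xml fragment for the global side, as an illustration:

```xml
<!-- Hypothetical yarn-site.xml fragment: the global ApplicationMaster retry
     limit; per-application max-attempts must fall in [1, this value]. -->
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>2</value>
</property>
```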