Oops, wrong user list. Sorry. :-)
> On 16 Jun 2017, at 10.44, Jan Holmberg <jan.holmb...@perigeum.fi> wrote:
Hi,
I ran into Kudu's limit on the maximum number of columns (300). The same limit
seemed to apply to the latest Kudu version as well, but not to e.g. Impala/Hive
(to the same extent, at least).
* is this limitation going to be loosened in the near future?
* any suggestions on how to work around this limitation? Table splitting is the
ms
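One common workaround for a per-table column cap is vertical partitioning: split the wide table into several narrower tables that share the same primary key, and join them back at query time. Below is a minimal sketch of just the column-grouping step, in plain Python; the function name, the `id` key column, and the synthetic 700-column table are hypothetical, and only the 300-column cap comes from the thread.

```python
def split_columns(columns, key, max_cols=300):
    """Partition a wide table's columns into groups of at most
    max_cols columns each; every group carries the shared key
    column so the narrow tables can be joined back together."""
    data_cols = [c for c in columns if c != key]
    room = max_cols - 1  # one slot per group is reserved for the key
    groups = []
    for i in range(0, len(data_cols), room):
        groups.append([key] + data_cols[i:i + room])
    return groups

# 700 data columns plus a key -> 3 narrow tables, each <= 300 columns
wide = ["id"] + [f"c{i}" for i in range(700)]
groups = split_columns(wide, key="id")
print(len(groups), [len(g) for g in groups])  # 3 [300, 300, 103]
```

Each resulting group would become its own Kudu table with the same primary key; queries over the full width then join the narrow tables on that key.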
Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 5 April 2016 at 20:56, Jan Holmberg <jan.holmb...@perigeum.fi> wrote:
a for the IO monitoring
On Tue, 5 Apr 2016, 20:56 Jan Holmberg, <jan.holmb...@perigeum.fi> wrote:
I'm trying to get a rough estimate of how much data I can write within a
certain time period (GB/sec).
-jan
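A rough write-throughput number can be taken by timing a write of a known number of bytes and dividing. This sketch in plain Python illustrates only that arithmetic (hypothetical helper and probe file, not any Spark API); the measured value of course depends on the disk and payload size.

```python
import os
import tempfile
import time

def write_throughput_gb_per_s(payload: bytes, path: str) -> float:
    """Time a single write+flush of `payload` to `path` and
    return the observed rate in GB/s."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # include the flush to disk in the timing
    elapsed = time.perf_counter() - start
    return len(payload) / elapsed / 1e9

with tempfile.TemporaryDirectory() as d:
    rate = write_throughput_gb_per_s(b"x" * (16 * 1024 * 1024),
                                     os.path.join(d, "probe.bin"))
    print(f"{rate:.3f} GB/s")  # value depends on the machine
```

For a cluster estimate, the same idea applies per worker: bytes written divided by wall-clock time, summed across workers writing in parallel.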
On 05 Apr 2016, at 22:49, Mich Talebzadeh wrote:
What are the metrics?
Throughput of data, latency, velocity, volume?
HTH
On 5 April 2016 at 20:42, Jan Holmberg wrote:
Hi,
I'm trying to figure out how to write lots of data from each worker. I tried
rdd.saveAsTextFile but got an OOM when generating a 1024 MB string on a worker.
Increasing worker memory would mean dropping the number of workers.
So, any idea how to write e.g. a 1 GB file from each worker?
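The OOM here most likely comes from materializing the whole 1 GB payload as one string before writing. Streaming the records out in small batches keeps peak memory at roughly one buffer rather than the whole file. A minimal sketch of that idea in plain Python (the record generator and helper name are hypothetical, not the Spark API; in Spark the analogous move is writing from a partition's iterator instead of building one string):

```python
import os
import tempfile

def write_incrementally(records, path, buffer_lines=10_000):
    """Stream records to disk in small batches instead of building
    one giant string first; peak memory stays at roughly
    buffer_lines lines, not the whole file."""
    written = 0
    with open(path, "w") as f:
        buf = []
        for rec in records:
            buf.append(rec)
            if len(buf) >= buffer_lines:
                f.write("\n".join(buf) + "\n")
                written += len(buf)
                buf.clear()
        if buf:  # flush the final partial buffer
            f.write("\n".join(buf) + "\n")
            written += len(buf)
    return written

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "part-00000")
    n = write_incrementally((f"row-{i}" for i in range(25_000)), path)
    print(n)  # 25000
```

The same principle suggests repartitioning so each worker holds one target-file's worth of records and writes them lazily, rather than concatenating them into a single in-memory string.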
Hi,
any preferences for how to run constantly-running (streaming) jobs in CDH?
Oozie? Command line? Something else?
cheers,
-jan
orporate it. Probably you can file a
jira and experienced contributors can share their thoughts.
1. https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala (line 131)
- Thanks, via mobile, excuse brevity.
On D
Hi,
I'm stuck writing partitioned data to HDFS. The example below ends with an
'already exists' error.
I'm wondering how to handle the streaming use case.
What is the intended way to write streaming data to HDFS? What am I missing?
cheers,
-jan
import com.databricks.spark.avro._
import
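The 'already exists' error typically means every run (or every micro-batch) targets the same output directory, which HDFS writers refuse to overwrite by default. Giving each batch a distinct, timestamped path avoids the clash, which is the idea the reply below is pointing at. A sketch of just the path-building logic in plain Python (the helper name and the partition-style layout are hypothetical, not the Spark/Avro API):

```python
from datetime import datetime, timezone

def batch_output_path(base: str, batch_time: datetime) -> str:
    """Build a distinct, partition-style output directory per
    micro-batch, e.g. <base>/year=YYYY/month=MM/batch=<epoch>."""
    return (f"{base}/year={batch_time.year}"
            f"/month={batch_time.month:02d}"
            f"/batch={int(batch_time.timestamp())}")

t = datetime(2015, 12, 22, 12, 31, tzinfo=timezone.utc)
print(batch_output_path("/tmp/data", t))
# /tmp/data/year=2015/month=12/batch=1450787460
```

In Spark Streaming the same effect is what DStream.saveAsTextFiles(prefix) gives by suffixing each output directory with the batch time; for DataFrame writes, an append save mode is another option, but a distinct path per run sidesteps the conflict entirely.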
should give you a distinct path for every run.
Let us know if it helps or if I missed anything.
Good luck.
- Thanks, via mobile, excuse brevity.
On Dec 22, 2015 2:31 PM, "Jan Holmberg"
<jan.holmb...@perigeum.fi> wrote:
p/data/year/month/02
Or,
/tmp/data/01/year/month
/tmp/data/02/year/month
This is a workaround.
I'm sure other, better approaches will follow.
- Thanks, via mobile, excuse brevity.
On Dec 22, 2015 7:01 PM, "Jan Holmberg"
<jan.holmb...@perigeum.fi> wrote:
Hi,
I'm new to Spark and trying to test my first Spark program. I'm running SparkPi
successfully in yarn-client mode, but when running the same in yarn-cluster
mode, the app gets stuck in the ACCEPTED phase. I've tried for hours to hunt
down the reason, but the outcome is always the same. Any hints on what to look
for next?
node?
If you go to the ResourceManager web UI, does it indicate any containers are
running?
-Sandy
On May 19, 2014, at 11:43 PM, Jan Holmberg jan.holmb...@perigeum.fi wrote:
that are fired when I start the Spark run:
Zookeeper : caught end of stream exception
Yarn : The specific max attempts: 0 for application: 1 is invalid, because it
is out of the range [1, 2]. Use the global max attempts
-jan
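The second error says the application was submitted with a max-attempts value of 0, while YARN only accepts values in the range [1, yarn.resourcemanager.am.max-attempts] (2 on this cluster). A hedged guess at a fix, assuming a Spark-on-YARN setup where the submitted value can be pinned explicitly (the property name spark.yarn.maxAppAttempts exists in later Spark releases but may differ in older versions, and the jar path below is a placeholder):

```shell
# Config fragment: keep AM attempts within YARN's allowed range [1, 2]
spark-submit \
  --master yarn-cluster \
  --conf spark.yarn.maxAppAttempts=2 \
  --class org.apache.spark.examples.SparkPi \
  spark-examples.jar
```

If the app still sits in ACCEPTED, the ResourceManager web UI (as suggested above) is the place to check whether the queue actually has capacity to start the ApplicationMaster container.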
On 20 May 2014, at 11:14, Jan Holmberg jan.holmb...@perigeum.fi wrote: