In general, you should implement thread-safety in your code.
Which set of events are you interested in ?
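For example, here is a rough sketch of a listener that keeps its shared state
thread-safe (the class name and the counted metric are made up for
illustration, not from your code):

import java.util.concurrent.atomic.AtomicLong
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

class TaskCountListener extends SparkListener {
  // Shared mutable state is kept in an atomic so concurrent callbacks are safe.
  private val tasksEnded = new AtomicLong(0)

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    tasksEnded.incrementAndGet()
  }

  def count: Long = tasksEnded.get()
}

It can then be registered with sc.addSparkListener(new TaskCountListener).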
Cheers
On Fri, Apr 1, 2016 at 9:23 AM, Truong Duc Kien
wrote:
> Hi,
>
> I need to gather some metrics using a SparkListener. Do the callback
> methods need to
Looks like this is the result of the following check:
val shouldReplace = output.exists(f => resolver(f.name, colName))
if (shouldReplace) {
where existing column, text, was replaced.
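A small sketch of that behavior, assuming a spark-shell sqlContext and
made-up column names:

import org.apache.spark.sql.functions.upper

val df = sqlContext.createDataFrame(Seq((0, "hello"), (1, "world"))).toDF("id", "text")
// "text" matches an existing column per the resolver check above, so
// withColumn replaces that column instead of appending a new one.
val replaced = df.withColumn("text", upper(df("text")))
replaced.columns  // Array(id, text) -- still two columns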
On Thu, Mar 31, 2016 at 12:08 PM, Jacek Laskowski wrote:
> Hi,
>
> Just ran into the
Can you show the stack trace ?
The log message came from
DiskBlockObjectWriter#revertPartialWritesAndClose().
Unfortunately, the method doesn't throw an exception, making it a bit hard for
the caller to know about the disk-full condition.
On Thu, Mar 31, 2016 at 11:32 AM, Abhishek Anand
Please exclude jackson-databind - that is where the AnnotationMap class
comes from.
On Thu, Mar 31, 2016 at 11:37 AM, Marcelo Oikawa <
marcelo.oik...@webradar.com> wrote:
> Hi, Alonso.
>
> As you can see jackson-core is provided by several libraries, try to
>> exclude it from spark-core, i
Spark 1.6.1 uses this version of jackson:
2.4.4
Looks like Tranquility uses a different version of jackson.
How do you build your jar ?
Consider using maven-shade-plugin to resolve the conflict if you use maven.
Cheers
On Thu, Mar 31, 2016 at 9:50 AM, Marcelo Oikawa
I tried this:
scala> final case class Text(id: Int, text: String)
warning: there was one unchecked warning; re-run with -unchecked for details
defined class Text
scala> val ds = Seq(Text(0, "hello"), Text(1, "world")).toDF.as[Text]
ds: org.apache.spark.sql.Dataset[Text] = [id: int, text: string]
Looking through
https://spark.apache.org/docs/latest/configuration.html#spark-streaming , I
don't see a config specific to YARN.
Can you pastebin the exception you saw ?
When the job stopped, was there any error ?
Thanks
On Wed, Mar 30, 2016 at 10:57 PM, Soni spark
How did you specify the packages ?
See the following from
https://spark.apache.org/docs/latest/submitting-applications.html :
Users may also include any other dependencies by supplying a
comma-delimited list of maven coordinates with --packages.
On Wed, Mar 30, 2016 at 7:15 AM, Mustafa Elbehery
Have you tried the following construct ?
new OrderedRDDFunctions[K, V, (K, V)](rdd).sortByKey()
See core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
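A quick sketch of wrapping a pair RDD this way (the sample data is
illustrative; filterByRange is available on the same wrapper):

import org.apache.spark.rdd.OrderedRDDFunctions

val rdd = sc.parallelize(Seq(3 -> "c", 1 -> "a", 2 -> "b"))
// Sort by key via the wrapper, as suggested above.
val sorted = new OrderedRDDFunctions[Int, String, (Int, String)](rdd).sortByKey()
// filterByRange works the same way:
val inRange = new OrderedRDDFunctions[Int, String, (Int, String)](sorted).filterByRange(1, 2)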
On Wed, Mar 30, 2016 at 5:20 AM, Nirav Patel wrote:
> Hi, I am trying to use filterByRange feature of
-c CORES, --cores CORES Total CPU cores to allow Spark applications to use
on the machine (default: all available); only on worker
bq. sc.getConf().set()
I think you should use this pattern (shown in
https://spark.apache.org/docs/latest/spark-standalone.html):
val conf = new SparkConf()
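i.e. set everything on the conf before constructing the context, along these
lines (the app name and values are illustrative):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MyApp")
  .set("spark.cores.max", "4")  // instead of mutating sc.getConf() afterwards
val sc = new SparkContext(conf)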
As the error said, com.sap.db.jdbc.topology.Host is not serializable.
Maybe post the question on the SAP HANA mailing list (if any) ?
On Tue, Mar 29, 2016 at 7:54 AM, reena upadhyay <
reena.upadh...@impetus.co.in> wrote:
> I am trying to execute query using spark sql on SAP HANA from spark
> shell. I
Can you disclose snippet of your code ?
Which Spark release do you use ?
Thanks
> On Mar 29, 2016, at 3:42 AM, Charan Adabala wrote:
>
> From the below image how can we reduce the computing time for the stages, at
> some stages the Executor Computing Time is less than
See this method:
lazy val rdd: RDD[T] = {
On Mon, Mar 28, 2016 at 6:30 PM, Russell Jurney
wrote:
> Ok, I'm also unable to save to Elasticsearch using a dataframe's RDD. This
> seems related to DataFrames. Is there a way to convert a DataFrame's RDD to
> a 'normal'
Can you describe what gets triggered by triggerAndWait ?
Cheers
On Mon, Mar 28, 2016 at 1:39 PM, kpeng1 wrote:
> Hi All,
>
> I am currently trying to debug a spark application written in scala. I
> have
> a main method:
> def main(args: Array[String]) {
> ...
>
Dropping dev@
Can you provide a bit more information ?
release of hbase
release of hadoop
I assume you're running on Linux.
Any change in Linux setup before the exception showed up ?
On Mon, Mar 28, 2016 at 10:30 AM, beeshma r wrote:
> Hi
> i am testing with newly build
Can you describe your use case a bit more ?
Since the row keys are not sorted in your example, there is a chance that
you get nondeterministic results when you aggregate on groups of two
successive rows.
Thanks
On Mon, Mar 28, 2016 at 9:21 AM, sujeet jog wrote:
> Hi,
>
>
tion and add the MyRDD.scala to the
>> project then the custom method can be called in the main function and it
>> works.
>> I misunderstand the usage of custom rdd, the custom rdd does not have to be
>> written to the spark project like UnionRDD, CogroupedRDD, and
> *myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3]* at
> customable at :28
> scala> *myrdd.customMethod(bulk)*
> *error: value customMethod is not a member of
> org.apache.spark.rdd.RDD[(Int, String)]*
>
> On Mon, Mar 28, 2016 at 12:50 AM, Ted Yu <yu
dd.customMethod(bulk)*
> *error: value customMethod is not a member of
> org.apache.spark.rdd.RDD[(Int, String)]*
>
> and the customable method in PairRDDFunctions.scala is
>
> def customable(partitioner: Partitioner): RDD[(K, V)] = self.withScope {
> new MyRDD[K, V](self, partition
Can you show the full stack trace (or top 10 lines) and the snippet using
your MyRDD ?
Thanks
On Sun, Mar 27, 2016 at 9:22 AM, Tenghuan He wrote:
> Hi everyone,
>
> I am creating a custom RDD which extends RDD and add a custom method,
> however the custom method
Please take a look at the MyRDD class in:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
There is scaladoc for the class. See how getPreferredLocations() is
implemented.
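For reference, a bare-bones sketch of such a custom RDD (not the MyRDD from
the suite, just an illustration of where getPreferredLocations fits):

import scala.reflect.ClassTag
import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

class MyRDD[T: ClassTag](parent: RDD[T]) extends RDD[T](parent) {
  // Delegate computation and partitioning to the parent RDD.
  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    firstParent[T].iterator(split, context)

  override protected def getPartitions: Array[Partition] =
    firstParent[T].partitions

  // Report where each partition's data lives so the scheduler can favor it.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    firstParent[T].preferredLocations(split)
}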
Cheers
On Sun, Mar 27, 2016 at 2:01 AM, chenyong wrote:
> Thank you Ted for
According to:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/bk_HDP_RelNotes-20151221.pdf
Spark 1.5.2 comes out of the box.
Suggest moving questions on HDP to Hortonworks forum.
Cheers
On Sat, Mar 26, 2016 at 3:32 PM, Mich Talebzadeh
wrote:
>
Please take a look at the following method:
  /**
   * Get the preferred locations of a partition, taking into account whether the
   * RDD is checkpointed.
   */
  final def preferredLocations(split: Partition): Seq[String] = {
    checkpointRDD.map(_.getPreferredLocations(split)).getOrElse {
%20Scala.pdf
>
>
> -- Forwarded message --
> From: Ted Yu <yuzhih...@gmail.com>
> Date: 26 March 2016 at 12:51
> Subject: Re: Any plans to migrate Transformer API to Spark SQL (closer to
> DataFrames)?
> To: Michał Zieliński <zielinski.mich...@gmail.co
Same with master branch.
I found derby.log in the following two files:
.gitignore:derby.log
dev/.rat-excludes:derby.log
FYI
On Sat, Mar 26, 2016 at 4:09 AM, Mich Talebzadeh
wrote:
> Having moved to Spark 1.6.1, I have noticed that whenever I start a
> spark-sql or
Session management has improved in 1.6.x (see SPARK-10810)
Mind giving 1.6.1 a try ?
Thanks
On Fri, Mar 25, 2016 at 3:48 PM, Mich Talebzadeh
wrote:
> I have noticed that the only sure way to specify a Hive table from Spark
> is to prefix it with database (DBName)
> On 25 March 2016 at 22:40, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Looks like database support
finds and gets the table
> info OK
>
> HTH
>
>
> On Friday, 25 March 2016, 22:32, Ted Yu <yuzhih...@gmail.com> wrote:
>
>
> Which release of Spark do you use, Mich ?
>
> In master branch, the message is more accurate
> (sql/catalyst/src/main/scala/org/apa
Which release of Spark do you use, Mich ?
In master branch, the message is more accurate
(sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/NoSuchItemException.scala):
override def getMessage: String = s"Table $table not found in database $db"
On Fri, Mar 25, 2016 at 3:21
efinitely solve my problem
>
> I have one more question .. if i want to launch a spark application in
> production environment so is there any other way so multiple users can
> submit there job without having hadoop configuration .
>
> Regards
> Prateek
>
>
> On
See this thread:
http://search-hadoop.com/m/q3RTtAvwgE7dEI02
On Fri, Mar 25, 2016 at 10:39 AM, prateek arora
wrote:
> Hi
>
> I want to submit spark application from outside of spark clusters . so
> please help me to provide a information regarding this.
>
>
This is the original subject of the JIRA:
Partition discovery fail if there is a _SUCCESS file in the table's root dir
If I remember correctly, there were discussions on how (traditional)
partition discovery slowed down Spark jobs.
Cheers
On Fri, Mar 25, 2016 at 10:15 AM, suresk
Do you mind showing the body of TO_DATE() ?
Thanks
On Fri, Mar 25, 2016 at 7:38 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> Looks like you forgot an import for Date.
>
> FYI
>
> On Fri, Mar 25, 2016 at 7:36 AM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
Looks like you forgot an import for Date.
FYI
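Something along these lines should work (a sketch only; the function name is
illustrative and the input is assumed to be in yyyy-MM-dd format):

import java.sql.Date

def changeDate(word: String): Date = {
  Date.valueOf(word)  // expects "yyyy-MM-dd"
}

sqlContext.udf.register("changeDate", changeDate _)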
On Fri, Mar 25, 2016 at 7:36 AM, Mich Talebzadeh
wrote:
>
>
> Hi,
>
> writing a UDF to convert a string into Date
>
> def ChangeDate(word : String) : Date = {
> | return
>
Here is the doc for defaultParallelism :
/** Default level of parallelism to use when not given by user (e.g.
parallelize and makeRDD). */
def defaultParallelism: Int = {
What if the user changes parallelism ?
Cheers
On Fri, Mar 25, 2016 at 5:33 AM, manasdebashiskar
the chain being
> purely transformations, then checkpointing instead of saving still wouldn't
> execute any action on the RDD -- it would just mark the point at which
> checkpointing should be done when an action is eventually run.
>
> On Wed, Mar 23, 2016 at 7:38 PM, Ted Yu <yuzhih...@gmail.com&g
bq. getServletHandlers is not intended for public use
From MetricsSystem.scala :
private[spark] class MetricsSystem private (
Looks like there is no easy way to extend the REST API.
On Thu, Mar 24, 2016 at 1:09 PM, Sebastian Kochman <
sebastian.koch...@outlook.com> wrote:
> Hello,
> I have a
ery...").show() /** works */
>
> sqlContext.cacheTable("table")
>
> sqlContext.sql("...complex query...").show() /** works */
>
> sqlContext.sql("...complex query...").show() /** fails */
>
>
>
> On 24.03.2016 13:40, Ted Yu wrote:
lue()
> {
> return new Jedis("10.101.41.19",6379);
> }
> };
>
>
>
> and then
>
>
>
> .foreachRDD(new VoidFunction<JavaRDD>() {
> public void call(JavaRDD rdd) throws Exception {
>
> for (TopData t: rdd.take(t
s
> jedis
> 2.8.0
> jar
> compile
>
>
>
>
>
>
>
> How can I look at those tasks?
>
>
>
> *Van:* Ted Yu [mailto:yuzhih...@gmail.com]
> *Verzonden:* donderdag 24 maart 2016 14:33
> *Aan:* Michel Hubert <mich...@
Which release of Spark are you using ?
Have you looked at the tasks whose IDs were printed to see if there is more
of a clue ?
Thanks
On Thu, Mar 24, 2016 at 6:12 AM, Michel Hubert wrote:
> HI,
>
>
>
> I constantly get these errors:
>
>
>
> 0[Executor task launch worker-15]
Can you pastebin the stack trace ?
If you can show snippet of your code, that would help give us more clue.
Thanks
> On Mar 24, 2016, at 2:43 AM, Mohamed Nadjib MAMI wrote:
>
> Hi all,
> I'm running SQL queries (sqlContext.sql()) on Parquet tables and facing a
>
Consider using their forum:
https://discuss.elastic.co/c/elasticsearch
> On Mar 24, 2016, at 3:09 AM, Jakub Stransky wrote:
>
> Hi,
>
> I am trying to write JavaPairRDD into elasticsearch 1.7 using spark 1.2.1
> using elasticsearch-hadoop 2.0.2
>
>
> before any job" comment pose any restriction in this case since no jobs
> have yet been executed on the RDD.
>
> On Wed, Mar 23, 2016 at 7:18 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> See the doc for checkpoint:
>>
>>* Mark this RDD
See the doc for checkpoint:
 * Mark this RDD for checkpointing. It will be saved to a file inside the checkpoint
 * directory set with `SparkContext#setCheckpointDir` and all references to its parent
 * RDDs will be removed. *This function must be called before any job has been
 * executed on this RDD.*
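In other words, something like the sketch below (the path is illustrative):
mark the RDD first, and the checkpoint is written when the first action runs.

sc.setCheckpointDir("/tmp/spark-checkpoints")
val rdd = sc.parallelize(1 to 100).map(_ * 2)
rdd.checkpoint()  // must be called before any job runs on this RDD
rdd.count()       // the first action triggers the actual checkpointing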
SPARK-13383 is fixed in 2.0 only, as of this moment.
Any chance of backporting to branch-1.6 ?
Thanks
On Wed, Mar 23, 2016 at 4:20 PM, Davies Liu wrote:
> On Wed, Mar 23, 2016 at 10:35 AM, Yong Zhang wrote:
> > Here is the output:
> >
> > ==
Please see:
https://www.linkedin.com/pulse/combining-druid-spark-interactive-flexible-analytics-scale-butani
which references https://github.com/SparklineData/spark-druid-olap
On Wed, Mar 23, 2016 at 5:59 AM, Raymond Honderdors <
raymond.honderd...@sizmek.com> wrote:
> Does anyone have a good
See related thread:
http://search-hadoop.com/m/q3RTtuwg442GBwKh
On Tue, Mar 22, 2016 at 12:13 PM, Mike Sukmanowsky <
mike.sukmanow...@gmail.com> wrote:
> The Source class is private
>
Can you show a code snippet and the exception for 'Task is not serializable' ?
Please see related JIRA:
SPARK-10251
whose pull request contains code for registering classes with Kryo.
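For reference, a minimal sketch of registering classes with Kryo (Point and
Record are stand-ins for your own classes):

import org.apache.spark.SparkConf

case class Point(x: Double, y: Double)
case class Record(id: Int, p: Point)

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[Point], classOf[Record]))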
Cheers
On Tue, Mar 22, 2016 at 7:00 AM, Hafsa Asif
wrote:
> Hello,
> I am facing
The NullPointerEx came from Spring.
Which version of Spring do you use ?
Thanks
> On Mar 22, 2016, at 6:08 AM, Hafsa Asif wrote:
>
> yes I know it is because of NullPointerEception, but could not understand
> why?
> The complete stack trace is :
> [2016-03-22
bq. Caused by: java.lang.NullPointerException
Can you show the remaining stack trace ?
Thanks
On Tue, Mar 22, 2016 at 5:43 AM, Hafsa Asif
wrote:
> Hello everyone,
> I am trying to get benefits of DataFrames (to perform all SQL BASED
> operations like 'Where Clause',
Can you tell us the commit hash your workspace is based on ?
On Mon, Mar 21, 2016 at 8:05 PM, Gayathri Murali <
gayathri.m.sof...@gmail.com> wrote:
> Hi All,
>
> I am trying to run ./python/run-tests on my local master branch. I am
> getting the following error. I have run this multiple times
>
> 2016-03-21 15:24 GMT+02:00 Ted Yu <yuzhih...@gmail.com>:
>
>> Was speculative execution enabled ?
>>
>> Thanks
>>
>> On Mar 21, 2016, at 6:19 AM, Yasemin Kaya <godo...@gmail.com> wrote:
>>
>> Hi,
>>
>>
Can you provide a bit more information ?
Release of Spark and YARN
Have you checked Spark UI / YARN job log to see if there is some clue ?
Cheers
On Mon, Mar 21, 2016 at 6:21 AM, Roberto Pagliari wrote:
> I noticed that sometimes the spark cluster seems to restart
Was speculative execution enabled ?
Thanks
> On Mar 21, 2016, at 6:19 AM, Yasemin Kaya wrote:
>
> Hi,
>
> I am using S3 read data also I want to save my model S3. In reading part
> there is no error, but when I save model I am getting this error . I tried to
> change the
To speed up the build process, take a look at install_zinc() in build/mvn,
around line 83.
And the following around line 137:
# Now that zinc is ensured to be installed, check its status and, if its
# not running or just installed, start it
FYI
On Sun, Mar 20, 2016 at 7:44 PM, Tenghuan He
g.apache.spark.sql.DataFrame = [Payment date: string, Net: string,
> VAT: string, InvoiceNumber: int]
>
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn *
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8P
Please refer to the following methods of DataFrame:
def withColumn(colName: String, col: Column): DataFrame = {
def drop(colName: String): DataFrame = {
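e.g. a rough sketch, assuming the spark-csv package and the Net / VAT columns
from the schema above (the derived column name is made up):

import org.apache.spark.sql.functions.col

val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("path/to/file.csv")

val df2 = df
  .withColumn("Total", col("Net").cast("double") + col("VAT").cast("double"))
  .drop("VAT")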
On Sun, Mar 20, 2016 at 2:47 PM, Ashok Kumar
wrote:
> Gurus,
>
> I would like to read a csv file into a
$ jar tvf
./external/flume-sink/target/spark-streaming-flume-sink_2.10-1.6.1.jar |
grep SparkFlumeProtocol
841 Thu Mar 03 11:09:36 PST 2016
org/apache/spark/streaming/flume/sink/SparkFlumeProtocol$Callback.class
2363 Thu Mar 03 11:09:36 PST 2016
I took a look at docs/configuration.md
Though I didn't find an answer for your first question, I think the following
pertains to your second question:
spark.python.worker.memory (default: 512m)
    Amount of memory to use per python worker process during aggregation,
    in the same format as JVM memory strings (e.g. 512m, 2g).
Not that I know of.
Can you be a little more specific on which JVM(s) you want restarted
(assuming spark-submit is used to start a second job) ?
Thanks
On Sun, Mar 20, 2016 at 6:20 AM, Udo Fholl wrote:
> Hi all,
>
> Is there a way for spark-submit to restart the JVM in
For your last point, spark-submit has:
if [ -z "${SPARK_HOME}" ]; then
export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi
Meaning the script would determine the proper SPARK_HOME variable.
FYI
On Wed, Mar 16, 2016 at 4:22 AM, Леонид Поляков wrote:
> Hello, guys!
>
>
>
Can you give a bit more detail ?
Release of Spark
symptom of the renamed column not being recognized
Please take a look at "withColumnRenamed" test in:
sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
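A tiny sketch of what that test exercises (column names are illustrative):

val df = sqlContext.createDataFrame(Seq((1, "a"), (2, "b"))).toDF("id", "value")
val renamed = df.withColumnRenamed("value", "label")
renamed.columns  // Array(id, label)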
On Thu, Mar 17, 2016 at 2:02 AM, Divya Gehlot
wrote:
Can you call sc.stop() after indexing into Elasticsearch ?
> On Mar 16, 2016, at 9:17 PM, Imre Nagi wrote:
>
> Hi,
>
> I have a spark application for batch processing in standalone cluster. The
> job is to query the database and then do some transformation,
I looked at the places in SparkContext.scala where NewHadoopRDD is
constructed.
It seems the Configuration object shouldn't be null.
Which hbase release are you using (so that I can see which line the NPE
came from) ?
Thanks
On Fri, Mar 18, 2016 at 8:05 AM, Lubomir Nerad
It is defined in:
core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
On Thu, Mar 17, 2016 at 8:55 PM, Shishir Anshuman wrote:
> I am using following code snippet in scala:
>
>
> *val dict: RDD[String] = sc.textFile("path/to/csv/file")*
> *val
bq. .drop("Col9")
Could it be due to the above ?
On Wed, Mar 16, 2016 at 7:29 PM, Divya Gehlot
wrote:
> Hi,
> I am dynamically doing union all and adding new column too
>
> val dfresult =
>>
This is the line where NPE came from:
if (conf.get(SCAN) != null) {
So Configuration instance was null.
On Fri, Mar 18, 2016 at 9:58 AM, Lubomir Nerad <lubomir.ne...@oracle.com>
wrote:
> The HBase version is 1.0.1.1.
>
> Thanks,
> Lubo
>
>
> On 18.3.2016 17:29,
It turned out that Col1 appeared twice in the select :-)
> On Mar 16, 2016, at 7:29 PM, Divya Gehlot wrote:
>
> Hi,
> I am dynamically doing union all and adding new column too
>
>> val dfresult =
>>
bq. $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
Do you mind showing more of your code involving the map() ?
On Thu, Mar 17, 2016 at 8:32 AM, Dirceu Semighini Filho <
dirceu.semigh...@gmail.com> wrote:
> Hello,
> I found a strange behavior after executing a prediction with MLIB.
> My code
Which version of hadoop do you use ?
bq. Requesting to kill executor(s) 1136
Can you find more information on executor 1136 ?
Thanks
On Fri, Mar 18, 2016 at 4:16 PM, Nezih Yigitbasi <
nyigitb...@netflix.com.invalid> wrote:
> Hi Spark experts,
> I am using Spark 1.5.2 on YARN with dynamic
Can you utilize this function of DistributedLDAModel ?
override protected def getModel: OldLDAModel = oldDistributedModel
cheers
On Fri, Mar 18, 2016 at 7:34 AM, cindymc wrote:
> I like using the new DataFrame APIs on Spark ML, compared to using RDDs in
> the
Please take a look at:
core/src/test/scala/org/apache/spark/AccumulatorSuite.scala
FYI
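One way to do it, sketched below, is an AccumulableParam whose accumulated
type is a Set (the String element type is just an example):

import org.apache.spark.AccumulableParam

object StringSetParam extends AccumulableParam[Set[String], String] {
  // Add a single element contributed by a task.
  def addAccumulator(acc: Set[String], elem: String): Set[String] = acc + elem
  // Merge two partial sets on the driver.
  def addInPlace(s1: Set[String], s2: Set[String]): Set[String] = s1 ++ s2
  def zero(initial: Set[String]): Set[String] = Set.empty[String]
}

val acc = sc.accumulable(Set.empty[String])(StringSetParam)
sc.parallelize(Seq("a", "b", "a")).foreach(x => acc += x)
acc.value  // Set(a, b)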
On Tue, Mar 15, 2016 at 4:29 PM, SRK wrote:
> Hi,
>
> How do I add an accumulator for a Set in Spark?
>
> Thanks!
>
>
>
> --
> View this message in context:
>
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
>> On 15 March 2016 at 23:17, Ted Yu <yuzhih...@gmail.com> wrote:
>> 1.0
>> ...
>> scala
>>
>>> On Tue, Mar 15
1.0
...
scala
On Tue, Mar 15, 2016 at 4:14 PM, Mich Talebzadeh
wrote:
> An observation
>
> Once compiled with MVN the job submit works as follows:
>
> + /usr/lib/spark-1.5.2-bin-hadoop2.6/bin/spark-submit --packages
> com.databricks:spark-csv_2.11:1.3.0 --class
Which version of Spark are you using ?
Can you show the code snippet w.r.t. broadcast variable ?
Thanks
On Tue, Mar 15, 2016 at 6:04 AM, manasdebashiskar
wrote:
> Hi,
> I have a streaming application that takes data from a kafka topic and uses
> mapwithstate.
> After
Looks like there is no such capability yet.
How would you specify which RDDs to compress ?
Thanks
> On Mar 15, 2016, at 4:03 AM, Nirav Patel wrote:
>
> Hi,
>
> I see that there's following spark config to compress an RDD. My guess is it
> will compress all RDDs of
I did a quick search but haven't found a JIRA in this regard.
If the configuration were separate from the checkpoint data, more use cases
could be accommodated.
> On Mar 15, 2016, at 2:21 AM, Saisai Shao wrote:
>
> Currently configuration is a part of checkpoint data, and when
There're build jobs for both on Jenkins:
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/
You can choose either one.
I use mvn.
On Tue, Mar 15, 2016 at 3:42 AM, Mich Talebzadeh
Please refer to JIRAs which were related to MiMa,
e.g.
[SPARK-13834][BUILD] Update sbt and sbt plugins for 2.x.
It would be easier for other people to help if you provide a link to your PR.
Cheers
On Mon, Mar 14, 2016 at 7:22 PM, Gayathri Murali <
gayathri.m.sof...@gmail.com> wrote:
> Hi All,
>
>
ject.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> at
> org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> at
> org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
> at
Which Spark release do you use ?
For NoSuchElementException, was there anything else in the stack trace ?
Thanks
On Mon, Mar 14, 2016 at 12:12 PM, Boric Tan
wrote:
> Hi there,
>
> I was trying to access application information with REST API. Looks like
> the
> top
Summarizing an offline message:
The following worked for Divya:
dffiltered = dffiltered.unionAll(dfresult.filter ...
On Mon, Mar 14, 2016 at 5:54 AM, Lohith Samaga M
wrote:
> If all sql results have same set of columns you could UNION all the
> dataframes
>
> Create
Here is related code:
final int length = totalSize() + neededSize;
if (buffer.length < length) {
// This will not happen frequently, because the buffer is re-used.
final byte[] tmp = new byte[length * 2];
Looks like length was positive (since it was bigger than buffer.length)
dffiltered = dffiltered.unionAll(dfresult.filter(dfFilterSQLs(i)).select("Col1", "Col2", "Col3", "Col4", "Col5"))
FYI
On Sun, Mar 13, 2016 at 8:50 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Have you tried unionAll() method of DataFrame ?
Have you tried the unionAll() method of DataFrame ?
On Sun, Mar 13, 2016 at 8:44 PM, Divya Gehlot
wrote:
> Hi,
>
> Please bear me for asking such a naive question
> I have list of conditions (dynamic sqls) sitting in hbase table .
> I need to iterate through those dynamic
The backport would be done under HBASE-14160.
FYI
On Sun, Mar 13, 2016 at 4:14 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
> Ted,
>
> Is there anything in the works or are there tasks already to do the
> back-porting?
>
> Just curious.
>
> Thanks,
> Ben
>
&
ies
> were introduced? If so, I can pulled that version at time and try again.
>
> Thanks,
> Ben
>
> On Mar 13, 2016, at 12:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Benjamin:
> Since hbase-spark is in its own module, you can pull the whole hbase-spark
> subtree into
> By the way, we are using Spark 1.6 if it matters.
>
> Thanks,
> Ben
>
> On Feb 10, 2016, at 2:34 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Have you tried adding hbase client jars to spark.executor.extraClassPath ?
>
> Cheers
>
> On Wed, Feb 10, 2016 at 12:17 A
Interesting.
If kv._1 was null, shouldn't the NPE have come from getPartition() (line
105) ?
Was it possible that records.next() returned null ?
On Fri, Mar 11, 2016 at 11:20 PM, Prabhu Joseph
wrote:
> Looking at ExternalSorter.scala line 192, i suspect some input
Which Spark release do you use ?
I wonder if the following may have fixed the problem:
SPARK-8029 Robust shuffle writer
JIRA is down, cannot check now.
On Fri, Mar 11, 2016 at 11:01 PM, Saurabh Guru
wrote:
> I am seeing the following exception in my Spark Cluster every
KafkaLZ4BlockOutputStream is in kafka-clients jar :
$ jar tvf kafka-clients-0.8.2.0.jar | grep KafkaLZ4BlockOutputStream
1609 Wed Jan 28 22:30:36 PST 2015
org/apache/kafka/common/message/KafkaLZ4BlockOutputStream$BD.class
2918 Wed Jan 28 22:30:36 PST 2015
Looks like a Scala version mismatch.
Are you using 2.11 everywhere ?
On Fri, Mar 11, 2016 at 10:33 AM, vasu20 wrote:
> Hi
>
> Any help appreciated on this. I am trying to write a Spark program using
> IntelliJ. I get a run time error as soon as new SparkConf() is called from
Please take a look at SPARK-3376 and discussion on
https://github.com/apache/spark/pull/5403
FYI
On Fri, Mar 11, 2016 at 6:37 AM, Xudong Zheng wrote:
> Hi all,
>
> Does Spark support in-memory shuffling now? If not, is there any
> consideration for it?
>
> Thanks!
>
> --
>
Temporary tables are associated with SessionState, which is used
by SQLContext.
Did you keep the session ?
Cheers
On Fri, Mar 11, 2016 at 5:02 AM, ram kumar wrote:
> Hi,
>
> I registered a dataframe as a table using registerTempTable
> and I didn't close the Spark
thod. Everything is working there.
>
> *pastebin of log:* http://pastebin.com/0LjTWLfm
>
>
> Thanks
> Shams
>
> On Thu, Mar 10, 2016 at 8:11 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Can you provide a bit more information ?
>>
>> Release of Spark
Can you provide a bit more information ?
Release of Spark
command for submitting your app
code snippet of your app
pastebin of log
Thanks
On Thu, Mar 10, 2016 at 6:32 AM, Shams ul Haque wrote:
> Hi,
>
> I have developed a spark realtime app and started spark-standalone on
bq. Question is how to get maven repository
As you may have noted, the version has SNAPSHOT in it.
Please check out the latest code from the master branch and build it yourself.
The 2.0 release is still a few months away - though the backport of the
hbase-spark module should come in the 1.3 release.
On Wed, Mar 9, 2016 at
bq. drwxr-xr-x - tomcat7 supergroup 0 2016-03-09 23:16 /tmp/swg
If I read the above line correctly, the size of the file was 0.
On Wed, Mar 9, 2016 at 10:00 AM, srimugunthan dhandapani <
srimugunthan.dhandap...@gmail.com> wrote:
> Hi all
> I am working in cloudera CDH5.6 and
bq. there is a varying number of items for that record
If the combination of items is very large, using a case class would be
tedious.
On Wed, Mar 9, 2016 at 9:57 AM, Saurabh Bajaj
wrote:
> You can load that binary up as a String RDD, then map over that RDD and
> convert