I have an Avro-generated class, com.avro.Person, which has a method
getClassSchema.
I'm passing the class name to a method, and inside that method I need to get
the Avro schema. Here is the code I'm trying to use:
val pr = Class.forName(productCls) // where productCls = classOf[Product].getName
How
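One way to approach this is plain reflection: Avro-generated classes expose getClassSchema as a public static method, so it can be looked up by name and invoked with a null receiver. The sketch below is in Java (the same calls work from Scala) and is only an illustration. SchemaLookup, schemaFor, and the nested Person stand-in are hypothetical names, and the stand-in returns a String so the example runs without the Avro jar; a real generated class would return org.apache.avro.Schema.

```java
import java.lang.reflect.Method;

class SchemaLookup {
    // Hypothetical stand-in for an Avro-generated class. A real one would
    // return org.apache.avro.Schema from its static getClassSchema().
    public static class Person {
        public static String getClassSchema() {
            return "{\"type\":\"record\",\"name\":\"Person\",\"fields\":[]}";
        }
    }

    // Load the class by name and call its static getClassSchema() reflectively.
    static Object schemaFor(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Method getter = cls.getMethod("getClassSchema");
            return getter.invoke(null); // static method, so no receiver object
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No schema for " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(schemaFor(Person.class.getName()));
    }
}
```

With the real Avro library on the classpath, the result could be cast to org.apache.avro.Schema at the call site.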
Hello everyone,
I wanted to do something like this:
Given a JavaPairRDD (let's say with 10 rows), I want to store each of the
rows separately, with the following requirements:
a) Each of them should be a map (cannot use saveAsTextFile)
b) The file name should have the key in it (e.g. if the key is 0,1..
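A rough sketch of requirement (b), assuming the RDD is small enough (10 rows here) to bring to the driver first, for example with JavaPairRDD.collectAsMap(). Everything below is plain Java I/O rather than Spark code. PerKeyWriter, writeEach, and the "row-<key>.txt" naming pattern are hypothetical, since the message does not show the desired file name format, and the Integer/String types are stand-ins for the real key and value types.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class PerKeyWriter {
    // Write each entry to its own file, with the key embedded in the file name.
    // The "row-<key>.txt" pattern is an assumption made for illustration only.
    static List<Path> writeEach(Map<Integer, String> rows, Path dir) {
        List<Path> written = new ArrayList<>();
        try {
            Files.createDirectories(dir);
            for (Map.Entry<Integer, String> row : rows.entrySet()) {
                Path file = dir.resolve("row-" + row.getKey() + ".txt");
                Files.write(file, row.getValue().getBytes(StandardCharsets.UTF_8));
                written.add(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

This only makes sense for a handful of rows; for larger data the writing would have to happen on the executors instead.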
Hi,
This is a question regarding timezone conversion with the from_utc_timestamp
function.
The observation is that the function returns different values for a zoneId and
a zoneOffset that denote the same timezone.
Ex: "America/Los_Angeles" and "-08:00"
System Timezone is +05:30
Timestamp: 1519430400
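For reference, 1519430400 is 2018-02-24 00:00:00 UTC, and Los Angeles is on UTC-08:00 on that date, so the two spellings would be expected to give the same local time. The Java sketch below checks that with java.time. It also shows how the legacy java.util.TimeZone API treats a bare offset string; that part is only a possible explanation, under the assumption (not confirmed in this thread) that Spark resolves the argument through java.util.TimeZone. ZoneCheck, localTime, and legacyId are hypothetical names.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.TimeZone;

class ZoneCheck {
    // Local wall-clock time of an epoch second in the given zone (java.time).
    static LocalDateTime localTime(long epochSecond, String zone) {
        return LocalDateTime.ofInstant(Instant.ofEpochSecond(epochSecond), ZoneId.of(zone));
    }

    // ID that the legacy java.util.TimeZone API resolves a string to.
    // Strings it does not recognize silently fall back to "GMT".
    static String legacyId(String zone) {
        return TimeZone.getTimeZone(zone).getID();
    }

    public static void main(String[] args) {
        long ts = 1519430400L;
        System.out.println(localTime(ts, "America/Los_Angeles")); // 2018-02-23T16:00
        System.out.println(localTime(ts, "-08:00"));              // 2018-02-23T16:00
        System.out.println(legacyId("-08:00"));                   // GMT
        System.out.println(legacyId("GMT-08:00"));                // GMT-08:00
    }
}
```

If that assumption holds, writing the offset as "GMT-08:00" instead of "-08:00" would be worth trying.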
Thanks Jörn,
FairScheduler is already enabled in yarn-site.xml:
yarn.resourcemanager.scheduler.class - org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.scheduler.fair.allow-undeclared-pools - true
yarn.scheduler.fair.user-as-default-queue - true
Hi Vijay:
I am using spark-shell because I am still prototyping the steps involved.
Regarding executors - I have 280 executors, and the UI only shows a few straggler
tasks on each trigger. The UI does not show much time spent on GC.
I suspect the delay is because of getting data from Kafka. The
FairScheduler in YARN gives you the possibility to use more resources than
configured if they are available.
On 24. Feb 2018, at 13:47, akshay naidu wrote:
>> it sure is not able to get sufficient resources from YARN to start the
>> containers.
> that's right. I
>
> it sure is not able to get sufficient resources from YARN to start the
> containers.
>
that's right. It worked when I reduced the executors for thrift, but that also
reduced thrift's performance.
But that is not the solution I am looking for. My sqoop import job
runs just once a day, and thrift