RE: Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-18 Thread Nasrulla Khan Haris
I was providing an IP address instead of an FQDN. Providing the FQDN helped. Thanks, From: Nasrulla Khan Haris Sent: Wednesday, September 16, 2020 4:11 PM To: dev@spark.apache.org Subject: Spark-Locality: Hinting Spark location of the executor does not take effect. Hi Spark developers, If I want to hint
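
The fix reported above was to pass fully qualified domain names rather than IP addresses as locality hints. A minimal sketch of that normalization step, assuming the hints start out as IP strings on the driver. `to_fqdn` and `normalize_hosts` are hypothetical helper names, no Spark API is called, and the result depends on local DNS and hosts-file configuration:

```python
import socket
from typing import Iterable, List


def to_fqdn(host: str) -> str:
    """Return the fully qualified name for a host name or IP string.

    socket.getfqdn falls back to returning its input unchanged when no
    name can be resolved, so this never raises for an unknown address.
    """
    return socket.getfqdn(host)


def normalize_hosts(hosts: Iterable[str]) -> List[str]:
    """Map every locality hint to its FQDN, preserving order."""
    return [to_fqdn(h) for h in hosts]


# Hypothetical usage: the loopback address stands in for a real node IP.
hints = normalize_hosts(["127.0.0.1"])
```

The thread does not say how the names are matched, so treat the assumption that the hints must equal the host names the executors register under as unverified.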

Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-16 Thread Nasrulla Khan Haris
Hi Spark developers, I want to hint Spark to use a particular list of hosts to execute tasks on. I see that getBlockLocations is used to get the list of hosts from HDFS.

RE: UnknownSource NullPointerException in CodeGen. with Custom Strategy

2020-06-28 Thread Nasrulla Khan Haris
read.java:748) Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Saturday, June 27, 2020 11:18 PM To: dev@spark.apache.org Subject: UnknownSource NullPointerException in CodeGen. with Custom Strategy Hi Spark developers, I am encountering this NullPointerException while reading a Parquet file in multi-

UnknownSource NullPointerException in CodeGen. with Custom Strategy

2020-06-28 Thread Nasrulla Khan Haris
Hi Spark developers, I am encountering this NullPointerException while reading a Parquet file on a multi-node cluster. However, when running the Spark job locally on a single node (development environment), I do not encounter this error. I would appreciate your input. Thanks in advance, NKH

Datasource with ColumnBatchScan support.

2020-06-15 Thread Nasrulla Khan Haris
Hi Spark developers, FileSourceScanExec extends ColumnarBatchScan, which internally converts ColumnarBatch to InternalRows. If I

RE: [EXTERNAL] Re: ColumnnarBatch to InternalRow Cast exception with codegen enabled.

2020-06-12 Thread Nasrulla Khan Haris
. From: Kris Mo Sent: Friday, June 12, 2020 2:20 AM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: [EXTERNAL] Re: ColumnnarBatch to InternalRow Cast exception with codegen enabled. Hi Nasrulla, Not sure what your new code is doing, but the symptom looks like you're creating a new data

RE: [EXTERNAL] Re: ColumnnarBatch to InternalRow Cast exception with codegen enabled.

2020-06-12 Thread Nasrulla Khan Haris
Thanks, Kris, for your input. Yes, I have a new data source that wraps the built-in Parquet data source. What I do not understand is why, with WSCG disabled, the output is not a columnar batch. From: Kris Mo Sent: Friday, June 12, 2020 2:20 AM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject

ColumnnarBatch to InternalRow Cast exception with codegen enabled.

2020-06-11 Thread Nasrulla Khan Haris
Hi Spark developers, I have a new BaseRelation that initializes a ParquetFileFormat object, and when reading the data I encounter the cast exception below. However, when I disable codegen support with the config "spark.sql.codegen.wholeStage" = false, I do not encounter this exception. 20/06/11

preferredlocations for hadoopfsrelations based baseRelations

2020-06-04 Thread Nasrulla Khan Haris
Hi Spark developers, I have created a new format extending FileFormat. I see that getPreferredLocations is available if a new custom RDD is created. Since FileFormat is based on FileScanRDD, which uses the readfile method to read a partitioned file, is there a way to add the desired preferredLocations?

RE: Adding Custom finalize method to RDDs.

2019-06-12 Thread Nasrulla Khan Haris
, Nasrulla From: Phillip Henry Sent: Tuesday, June 11, 2019 11:28 PM To: Nasrulla Khan Haris Cc: Vinoo Ganesh ; dev@spark.apache.org Subject: Re: Adding Custom finalize method to RDDs. That's not the kind of thing a finalize method was ever supposed to do. Use a try/finally block instead
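
Phillip Henry's suggestion of a try/finally block can be sketched as follows. This is an illustration in plain Python rather than the poster's data source code. `write_and_measure` and the `connector-` prefix are hypothetical, and a local temp file stands in for whatever files the connector creates:

```python
import os
import tempfile
from typing import Tuple


def write_and_measure(payload: bytes) -> Tuple[str, int]:
    """Write payload to a temp file, read back its size, then delete it.

    The finally clause runs whether the body returns or raises, so the
    cleanup does not depend on when (or whether) an object is collected.
    """
    fd, path = tempfile.mkstemp(prefix="connector-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        size = os.path.getsize(path)
        return path, size
    finally:
        # Deterministic cleanup: executed before the function returns.
        if os.path.exists(path):
            os.remove(path)


used_path, size = write_and_measure(b"hello")
```

Unlike a finalizer, which runs only if and when the garbage collector decides to, the cleanup here is tied to the scope that created the file.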

RE: Adding Custom finalize method to RDDs.

2019-06-11 Thread Nasrulla Khan Haris
I want to delete some files that I created in my data source API as soon as the RDD is cleaned up. Thanks, Nasrulla From: Vinoo Ganesh Sent: Monday, June 10, 2019 1:32 PM To: Nasrulla Khan Haris; dev@spark.apache.org Subject: Re: Adding Custom finalize method to RDDs. Generally overriding

RE: Adding Custom finalize method to RDDs.

2019-06-10 Thread Nasrulla Khan Haris
Hello everyone, Is there a way to do it from user code? Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Sunday, June 9, 2019 5:30 PM To: dev@spark.apache.org Subject: Adding Custom finalize method to RDDs. Hi all, Is there a way to add a custom finalize method to RDD objects to add custom

Adding Custom finalize method to RDDs.

2019-06-09 Thread Nasrulla Khan Haris
Hi all, Is there a way to add a custom finalize method to RDD objects so that custom logic runs when RDDs are destroyed by the JVM? Thanks, Nasrulla

RE: RDD object Out of scope.

2019-05-21 Thread Nasrulla Khan Haris
Thanks Sean, that makes sense. Regards, Nasrulla -Original Message- From: Sean Owen Sent: Tuesday, May 21, 2019 6:24 PM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: Re: RDD object Out of scope. I'm not clear what you're asking. An RDD itself is just an object in the JVM

RE: RDD object Out of scope.

2019-05-21 Thread Nasrulla Khan Haris
I am trying to find the code that cleans up uncached RDDs. Thanks, Nasrulla From: Charoes Sent: Tuesday, May 21, 2019 5:10 PM To: Nasrulla Khan Haris Cc: Wenchen Fan; dev@spark.apache.org Subject: Re: RDD object Out of scope. If you cached an RDD and hold a reference to that RDD in your code

RE: RDD object Out of scope.

2019-05-21 Thread Nasrulla Khan Haris
Thanks for the reply, Wenchen. I am curious as to what happens when an RDD goes out of scope when it is not cached. Nasrulla From: Wenchen Fan Sent: Tuesday, May 21, 2019 6:28 AM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: Re: RDD object Out of scope. RDD is kind of a pointer

RDD object Out of scope.

2019-05-20 Thread Nasrulla Khan Haris
Hi Spark developers, Can someone point out the code where RDD objects go out of scope? I found the ContextCleaner code, in which only persisted RDDs are cleaned up at regular intervals

RE: adding shutdownmanagerhook to spark.

2019-05-13 Thread Nasrulla Khan Haris
Thanks Sean Nasrulla -Original Message- From: Sean Owen Sent: Monday, May 13, 2019 4:16 PM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: Re: adding shutdownmanagerhook to spark. Spark just adds a hook to the mechanism that Hadoop exposes. You can do the same. You
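
Sean Owen's reply refers to a JVM-side hook mechanism exposed by Hadoop. The snippet below does not use that API. As a rough analogue for a Python driver process, it registers a cleanup callback with the standard library's `atexit` module. `make_scratch_dir` and the `connector-` prefix are hypothetical names:

```python
import atexit
import os
import shutil
import tempfile


def make_scratch_dir(prefix: str = "connector-") -> str:
    """Create a scratch directory and schedule its removal at interpreter exit."""
    path = tempfile.mkdtemp(prefix=prefix)
    # Registered callbacks run on normal interpreter shutdown, in reverse
    # order of registration. ignore_errors keeps the hook from failing if
    # the directory was already removed by other code.
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path


# Hypothetical usage: write a temporary file that the hook removes later.
scratch = make_scratch_dir()
with open(os.path.join(scratch, "part-0.tmp"), "w") as handle:
    handle.write("temporary data")
```

Callbacks registered this way are skipped if the process is terminated by an unhandled signal or by `os._exit`, so they suit best-effort cleanup rather than guaranteed cleanup.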

adding shutdownmanagerhook to spark.

2019-05-13 Thread Nasrulla Khan Haris
Hi all, I am trying to add a shutdown hook, but it looks like the ShutdownHookManager object is accessible only from within the spark package. Is there any other API that can help me do this? https://github.com/apache/spark/blob/v2.4.0/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala I can see

API for SparkContext ?

2019-05-13 Thread Nasrulla Khan Haris
Hi all, Is there an API for SparkContext where we can add our custom code before stopping the SparkContext? I appreciate your help. Thanks, Nasrulla

Need to clean up temporary files created post dataframe deletion.

2019-05-12 Thread Nasrulla Khan Haris
Hi all, I am new to Apache Spark core. I have a custom DataSource V2 connector for Spark and I am looking for suggestions on deleting temp files created by the connector. Any ideas would be helpful. Thanks in advance for your help. Nasrulla

RE: Need guidance on Spark Session Termination.

2019-05-08 Thread Nasrulla Khan Haris
Hi all, Any input here? Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Tuesday, May 7, 2019 3:58 PM To: dev@spark.apache.org Subject: Need guidance on Spark Session Termination. Hi fellow Spark devs, I am pretty new to Spark core and I am looking for some answers for my use case. I have

Need guidance on Spark Session Termination.

2019-05-07 Thread Nasrulla Khan Haris
Hi fellow Spark devs, I am pretty new to Spark core and I am looking for some answers for my use case. I have a DataSource V2 API connector. In my connector we create temporary files on blob storage. Can you please suggest places where I can look if I want to delete the temporary files on