RE: Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-18 Thread Nasrulla Khan Haris
I was providing an IP address instead of an FQDN. Providing the FQDN helped. Thanks, From: Nasrulla Khan Haris Sent: Wednesday, September 16, 2020 4:11 PM To: dev@spark.apache.org Subject: Spark-Locality: Hinting Spark location of the executor does not take effect. Hi Spark developers, If I want to hint
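
The fix reported above (an FQDN rather than an IP address) suggests that the location hint has to match the host name under which the executor is known. A minimal sketch of normalizing an address before using it as a hint, assuming the machine running it can resolve the address; `HostHints` and `toFqdn` are hypothetical names for illustration and are not Spark APIs:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Spark.
final class HostHints {
    // Maps an IP literal or short host name to the canonical host name
    // reported by the local resolver. If the name cannot be resolved,
    // the input is returned unchanged.
    static String toFqdn(String hostOrIp) {
        try {
            return InetAddress.getByName(hostOrIp).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return hostOrIp;
        }
    }
}
```

If reverse lookup is not configured, `getCanonicalHostName` returns the textual IP address, so the result should be checked against the host names the cluster actually reports.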

Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-16 Thread Nasrulla Khan Haris
Hi Spark developers, I want to hint Spark to use a particular list of hosts to execute tasks on. I see that getBlockLocations is used to get the list of hosts from HDFS.

Unable to run bash script when using spark-submit in cluster mode.

2020-07-23 Thread Nasrulla Khan Haris
Hi Spark Users, I am trying to execute a bash script from my Spark app. I can run the command below without issues from spark-shell; however, when I use it in the Spark app and submit with spark-submit, the container is not able to find the directories. val result = "export LD_LIBRARY_PATH=/

RE: Unable to run bash script when using spark-submit in cluster mode.

2020-07-23 Thread Nasrulla Khan Haris
Are local paths not exposed in containers? Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Thursday, July 23, 2020 6:13 PM To: user@spark.apache.org Subject: Unable to run bash script when using spark-submit in cluster mode. Importance: High Hi Spark Users, I am trying to execute bash

RE: UnknownSource NullPointerException in CodeGen. with Custom Strategy

2020-06-28 Thread Nasrulla Khan Haris
read.java:748) Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Saturday, June 27, 2020 11:18 PM To: dev@spark.apache.org Subject: UnknownSource NullPointerException in CodeGen. with Custom Strategy HI Spark Developers, Encountering this NullPointerException while reading parquet file in multi-

UnknownSource NullPointerException in CodeGen. with Custom Strategy

2020-06-28 Thread Nasrulla Khan Haris
Hi Spark Developers, I am encountering this NullPointerException while reading a Parquet file on a multi-node cluster. However, when running the Spark job locally on a single node (development environment), I do not encounter this error. I appreciate your inputs. Thanks in advance, NKH

Datasource with ColumnBatchScan support.

2020-06-15 Thread Nasrulla Khan Haris
Hi Spark developers, FileSourceScanExec extends ColumnarBatchScan, which internally converts ColumnarBatch to InternalRows. If I

RE: [EXTERNAL] Re: ColumnnarBatch to InternalRow Cast exception with codegen enabled.

2020-06-12 Thread Nasrulla Khan Haris
. From: Kris Mo Sent: Friday, June 12, 2020 2:20 AM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: [EXTERNAL] Re: ColumnnarBatch to InternalRow Cast exception with codegen enabled. Hi Nasrulla, Not sure what your new code is doing, but the symptom looks like you're creating a new data

RE: [EXTERNAL] Re: ColumnnarBatch to InternalRow Cast exception with codegen enabled.

2020-06-12 Thread Nasrulla Khan Haris
Thanks, Kris, for your inputs. Yes, I have a new data source which wraps the built-in Parquet data source. What I do not understand is why, with WSCG disabled, the output is not a columnar batch. From: Kris Mo Sent: Friday, June 12, 2020 2:20 AM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject

ColumnnarBatch to InternalRow Cast exception with codegen enabled.

2020-06-11 Thread Nasrulla Khan Haris
Hi Spark developers, I have a new BaseRelation which initializes a ParquetFileFormat object, and when reading the data I encounter the cast exception below; however, when I disable codegen support with the config "spark.sql.codegen.wholeStage"=false, I do not encounter this exception. 20/06/11

RE: Does Spark SQL support GRANT/REVOKE operations on Tables?

2020-06-10 Thread Nasrulla Khan Haris
I enabled the auth-related configs in hive-site.xml as per the document below. I tried this on Spark 2.4.4. Is it supported? https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server From: Nasrulla Khan Haris Sent: Wednesday, June 10, 2020 5:55 PM

Does Spark SQL support GRANT/REVOKE operations on Tables?

2020-06-10 Thread Nasrulla Khan Haris
Hi Spark users, I see REVOKE/GRANT operations in the list of supported operations, but when I run them on a table I see this error: org.apache.spark.sql.catalyst.parser.ParseException: Operation not allowed: GRANT(line 1, pos 0) == SQL == GRANT INSERT ON table_priv1 TO USER user2 ^^^ at

preferredlocations for hadoopfsrelations based baseRelations

2020-06-04 Thread Nasrulla Khan Haris
Hi Spark developers, I have created a new format extending FileFormat. I see getPreferredLocations is available if a new custom RDD is created. Since FileFormat is based on FileScanRDD, which uses the readfile method to read a partitioned file, is there a way to add desired preferredLocations?

RE: Adding Custom finalize method to RDDs.

2019-06-12 Thread Nasrulla Khan Haris
, Nasrulla From: Phillip Henry Sent: Tuesday, June 11, 2019 11:28 PM To: Nasrulla Khan Haris Cc: Vinoo Ganesh ; dev@spark.apache.org Subject: Re: Adding Custom finalize method to RDDs. That's not the kind of thing a finalize method was ever supposed to do. Use a try/finally block instead
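
The try/finally suggestion in the reply above can be sketched as follows. This is a minimal illustration, assuming the temporary file lives on a local file system and is only needed for one unit of work; `TempFileScope`, `Work`, and `runWithTempFile` are hypothetical names and not part of Spark or the data source API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Spark: ties a temp file's lifetime
// to a block of code instead of to garbage collection.
final class TempFileScope {
    interface Work {
        void run(Path tempFile) throws IOException;
    }

    // Creates a temp file, runs the work against it, and deletes the file
    // in a finally block so that cleanup also happens when the work throws.
    static Path runWithTempFile(Work work) throws IOException {
        Path tmp = Files.createTempFile("connector-", ".tmp");
        try {
            work.run(tmp);
            return tmp; // returned only so callers can confirm it is gone
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Unlike a finalize method, the finally block runs at a known point, whether or not the JVM ever collects the object that created the file.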

RE: Adding Custom finalize method to RDDs.

2019-06-11 Thread Nasrulla Khan Haris
I want to delete some files which I created in my data source API, as soon as the RDD is cleaned up. Thanks, Nasrulla From: Vinoo Ganesh Sent: Monday, June 10, 2019 1:32 PM To: Nasrulla Khan Haris ; dev@spark.apache.org Subject: Re: Adding Custom finalize method to RDDs. Generally overriding

RE: Adding Custom finalize method to RDDs.

2019-06-10 Thread Nasrulla Khan Haris
Hello Everyone, Is there a way to do it from user code? Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Sunday, June 9, 2019 5:30 PM To: dev@spark.apache.org Subject: Adding Custom finalize method to RDDs. Hi All, Is there a way to add custom finalize method to RDD objects to add custom

Adding Custom finalize method to RDDs.

2019-06-09 Thread Nasrulla Khan Haris
Hi All, Is there a way to add a custom finalize method to RDD objects to run custom logic when RDDs are destroyed by the JVM? Thanks, Nasrulla

RE: RDD object Out of scope.

2019-05-21 Thread Nasrulla Khan Haris
Thanks Sean, that makes sense. Regards, Nasrulla -Original Message- From: Sean Owen Sent: Tuesday, May 21, 2019 6:24 PM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: Re: RDD object Out of scope. I'm not clear what you're asking. An RDD itself is just an object in the JVM

RE: RDD object Out of scope.

2019-05-21 Thread Nasrulla Khan Haris
I am trying to find the code that cleans up uncached RDDs. Thanks, Nasrulla From: Charoes Sent: Tuesday, May 21, 2019 5:10 PM To: Nasrulla Khan Haris Cc: Wenchen Fan ; dev@spark.apache.org Subject: Re: RDD object Out of scope. If you cached a RDD and hold a reference of that RDD in your code

RE: RDD object Out of scope.

2019-05-21 Thread Nasrulla Khan Haris
Thanks for the reply, Wenchen. I am curious as to what happens when an RDD goes out of scope when it is not cached. Nasrulla From: Wenchen Fan Sent: Tuesday, May 21, 2019 6:28 AM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: Re: RDD object Out of scope. RDD is kind of a pointer

RDD object Out of scope.

2019-05-20 Thread Nasrulla Khan Haris
Hi Spark developers, Can someone point out the code where RDD objects go out of scope? I found the ContextCleaner code, in which only persisted RDDs are cleaned up at regular intervals

RE: adding shutdownmanagerhook to spark.

2019-05-13 Thread Nasrulla Khan Haris
Thanks Sean Nasrulla -Original Message- From: Sean Owen Sent: Monday, May 13, 2019 4:16 PM To: Nasrulla Khan Haris Cc: dev@spark.apache.org Subject: Re: adding shutdownmanagerhook to spark. Spark just adds a hook to the mechanism that Hadoop exposes. You can do the same. You
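
The reply above points to registering a hook directly rather than going through Spark's package-private ShutdownHookManager. A minimal sketch, assuming the cleanup is a local file deletion; it uses the plain JVM `Runtime.addShutdownHook` as a stand-in so that it runs without a Hadoop dependency (the Hadoop mechanism mentioned in the reply is not shown here), and `CleanupHook`, `deleteAction`, and `register` are hypothetical names:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Spark or Hadoop.
final class CleanupHook {
    // Builds the cleanup action separately from the hook so that it can be
    // exercised directly without shutting down the JVM.
    static Runnable deleteAction(Path path) {
        return () -> {
            try {
                Files.deleteIfExists(path);
            } catch (IOException e) {
                System.err.println("cleanup failed for " + path + ": " + e);
            }
        };
    }

    // Registers the action to run when the JVM shuts down normally.
    // The returned thread can be passed to removeShutdownHook later.
    static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "connector-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

JVM shutdown hooks run in no guaranteed order and do not run if the process is killed forcibly, so this complements explicit cleanup rather than replacing it.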

adding shutdownmanagerhook to spark.

2019-05-13 Thread Nasrulla Khan Haris
Hi All, I am trying to add a shutdown hook, but it looks like the shutdown manager object requires the package to be Spark only. Is there any other API that can help me do this? https://github.com/apache/spark/blob/v2.4.0/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala I can see

API for SparkContext ?

2019-05-13 Thread Nasrulla Khan Haris
Hi All, Is there an API for SparkContext where we can add our custom code before stopping the SparkContext? Appreciate your help. Thanks, Nasrulla

Need to clean up temporary files created post dataframe deletion.

2019-05-12 Thread Nasrulla Khan Haris
Hi All, I am new to Apache Spark core. I have a custom DataSource V2 connector for Spark and I am looking for suggestions on deleting temp files created by the connector. Any ideas would be helpful. Thanks for your help in advance. Nasrulla

RE: Need guidance on Spark Session Termination.

2019-05-08 Thread Nasrulla Khan Haris
Hi All, any inputs here? Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Tuesday, May 7, 2019 3:58 PM To: dev@spark.apache.org Subject: Need guidance on Spark Session Termination. Hi fellow Spark-devs, I am pretty new to spark core and I am looking for some answers to my use case. I have

Need guidance on Spark Session Termination.

2019-05-07 Thread Nasrulla Khan Haris
Hi fellow Spark-devs, I am pretty new to Spark core and I am looking for some answers to my use case. I have a DataSource V2 API connector. In my connector we create temporary files on the blob storage. Can you please suggest places where I can look if I want to delete the temporary files on

[jira] [Created] (HIVE-20866) Result Caching doesnt work if order of the columns is changed in the query.

2018-11-04 Thread Nasrulla Khan Haris (JIRA)
Nasrulla Khan Haris created HIVE-20866: -- Summary: Result Caching doesnt work if order of the columns is changed in the query. Key: HIVE-20866 URL: https://issues.apache.org/jira/browse/HIVE-20866

[jira] [Created] (HIVE-20865) Query Result caching for queries which contain subqueries which are cached.

2018-11-04 Thread Nasrulla Khan Haris (JIRA)
Nasrulla Khan Haris created HIVE-20865: -- Summary: Query Result caching for queries which contain subqueries which are cached. Key: HIVE-20865 URL: https://issues.apache.org/jira/browse/HIVE-20865