unsubscribe

2023-10-18 Thread ankur

unsubscribe

2023-09-13 Thread ankur

Unsubscribe

2023-03-26 Thread ankur
unsubscribe

Re: [DISCUSS] Enable blacklisting feature by default in 3.0

2019-04-02 Thread Ankur Gupta
t sure if this is necessary to move forward though) Thanks for all the feedback. Please let me know if there are other concerns that we would like to resolve before enabling blacklisting. Thanks, Ankur On Tue, Apr 2, 2019 at 2:45 AM Steve Loughran wrote: > > > On Fri, Mar 29, 2019 at 6:18 PM Rey

Re: [DISCUSS] Enable blacklisting feature by default in 3.0

2019-04-01 Thread Ankur Gupta
ady. On Mon, Apr 1, 2019 at 3:08 PM Chris Stevens wrote: > Hey Ankur, > > I think the significant decrease in "spark.blacklist.timeout" (1 hr down > to 5 minutes) in your updated suggestion is the key here. > > Looking at a few *successful* runs of the application I wa

Re: [DISCUSS] Enable blacklisting feature by default in 3.0

2019-04-01 Thread Ankur Gupta
saves a lot of unnecessary computation and also alerts admins to look for transient/permanent hardware failures. Please let me know if you think we should enable the blacklisting feature by default with the higher threshold. Thanks, Ankur On Fri, Mar 29, 2019 at 3:23 PM Chris Stevens wrote: > Hey All, >

Re: [DISCUSS] Enable blacklisting feature by default in 3.0

2019-03-29 Thread Ankur Gupta
> On Thu, Mar 28, 2019 at 3:32 PM, Ankur Gupta < > ankur.gu...@cloudera.com.invalid> wrote: > >> Hi all, >> >> This is a follow-on to my PR: https://github.com/apache/spark/pull/24208, >> where I aimed to enable blacklisting for fetch failure by default. Fr

[DISCUSS] Enable blacklisting feature by default in 3.0

2019-03-28 Thread Ankur Gupta
blacklisting timeout config* : spark.scheduler.executorTaskBlacklistTime Thanks, Ankur

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-27 Thread Ankur Gupta
useful for debugging. So a solution that keeps that behavior, but > writes INFO logs to this new sink, would be great. > > If you can come up with a solution to those problems I think this > could be a good feature. > > > On Wed, Aug 22, 2018 at 10:01 AM, Ankur Gupta >

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-22 Thread Ankur Gupta
ead during the shutdown phase of Spark Application. Thanks, Ankur On Wed, Aug 22, 2018 at 1:36 AM Marco Gaido wrote: > I agree with Saisai. You can also configure log4j to append anywhere else > other than the console. Many companies have their system for collecting and > monitoring

Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-21 Thread Ankur Gupta
cation end. This way the logs will be available as part of Yarn Logs. I am also interested in hearing about other ideas that the community may have about this. Or if someone has already solved this problem, then I would like them to contribute their solution to the community. Thanks, Ankur

Re: Spark GraphFrame ConnectedComponents

2017-01-05 Thread Ankur Srivastava
Adding DEV mailing list to see if this is a defect with ConnectedComponent or if they can recommend any solution. Thanks Ankur On Thu, Jan 5, 2017 at 1:10 PM, Ankur Srivastava <ankur.srivast...@gmail.com> wrote: > Yes I did try it out and it chooses the local file system as my c

Re: JavaRDD using Reflection

2015-09-14 Thread Ankur Srivastava
he data PairRDD and then invoke map function on that RDD. - Ankur On Mon, Sep 14, 2015 at 12:43 PM, <rachana.srivast...@thomsonreuters.com> wrote: > Thanks so much Ajay and Ankur for your input. > > > > What we are trying to do is the following: I am trying to invoke a class > using

Re: Two joins in GraphX Pregel implementation

2015-07-28 Thread Ankur Dave
of Pregel, though it would be worthwhile when combined with other improvements https://github.com/apache/spark/pull/1217. Ankur http://www.ankurdave.com/ pregel-simplify-join.patch Description: Binary data

Re: GraphX: New graph operator

2015-06-02 Thread Ankur Dave
rely on an implementation detail (vertex replication). Ankur http://www.ankurdave.com/ On Mon, Jun 1, 2015 at 8:54 AM, Tarek Auel tarek.a...@gmail.com wrote: Hello, Someone proposed in a Jira issue to implement new graph operations. Sean Owen recommended to check first with the mailing list

Re: GraphX implementation of ALS?

2015-05-26 Thread Ankur Dave
since then. The performance gap is because the MLlib version implements some ALS-specific optimizations that are hard to do using GraphX, such as storing the edges twice (partitioned by source and by destination) to reduce communication. Ankur http://www.ankurdave.com/ On Tue, May 26, 2015 at 3:36

Re: GraphX vertex partition/location strategy

2015-01-19 Thread Ankur Dave
the replication factor by 1. Ankur http://www.ankurdave.com/ On Mon, Jan 19, 2015 at 12:20 PM, Michael Malak michaelma...@yahoo.com.invalid wrote: Does GraphX make an effort to co-locate vertices onto the same workers as the majority (or even some) of its edges?

Re: [VOTE] Designating maintainers for some Spark components

2014-11-06 Thread Ankur Dave
+1 (binding) Ankur http://www.ankurdave.com/ On Wed, Nov 5, 2014 at 5:31 PM, Matei Zaharia matei.zaha...@gmail.com wrote: I'd like to formally call a [VOTE] on this model, to last 72 hours. The [VOTE] will end on Nov 8, 2014 at 6 PM PST.

Re: PARSING_ERROR from kryo

2014-09-15 Thread Ankur Dave
about that... Ankur [1] https://issues.apache.org/jira/browse/SPARK-3400 - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org

Re: Graphx seems to be broken while Creating a large graph(6B nodes in my case)

2014-08-25 Thread Ankur Dave
I posted the fix on the JIRA ticket (https://issues.apache.org/jira/browse/SPARK-3190). To update the user list, this is indeed an integer overflow problem when summing up the partition sizes. The fix is to use Longs for the sum: https://github.com/apache/spark/pull/2106. Ankur
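
The overflow described in this message can be reproduced in plain Scala without Spark. This is only a minimal sketch of the general failure mode, not the code from the linked pull request: the object name, the method names, and the partition sizes are all made up for illustration.

```scala
object PartitionSizeSum {
  // Hypothetical partition sizes: three partitions of 1.5 billion elements each.
  val partitionSizes: Seq[Int] = Seq(1500000000, 1500000000, 1500000000)

  // Summing as Int silently wraps around once the total passes Int.MaxValue.
  def sumAsInt(sizes: Seq[Int]): Int = sizes.sum

  // Widening each size to Long before summing keeps the true total.
  def sumAsLong(sizes: Seq[Int]): Long = sizes.map(_.toLong).sum

  def main(args: Array[String]): Unit = {
    println(s"Int sum:  ${sumAsInt(partitionSizes)}")
    println(s"Long sum: ${sumAsLong(partitionSizes)}")
  }
}
```

Each individual size fits in an Int here; only the running total overflows, which is why widening before the sum is enough.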

Re: VertexPartition and ShippableVertexPartition

2014-07-28 Thread Ankur Dave
along with a RoutingTablePartition. After joining the vertices with the edges, the edge partitions cache their adjacent vertices in the mirror cache. They use the VertexPartition for this, which provides only the hash map functionality and not the routing table. Ankur http://www.ankurdave.com/

Re: GraphX graph partitioning strategy

2014-07-25 Thread Ankur Dave
, partitionedEdges) A multipass partitioning algorithm could store its results in the edge attribute, and then you could use the code above to do the partitioning. Ankur http://www.ankurdave.com/ On Wed, Jul 23, 2014 at 11:59 PM, Larry Xiao xia...@sjtu.edu.cn wrote: Hi all, I'm implementing graph

Re: GraphX graph partitioning strategy

2014-07-25 Thread Ankur Dave
=> (getTripletPartition(e), e)) .partitionBy(new HashPartitioner(numPartitions)) *.map(pair => Edge(pair._2.srcId, pair._2.dstId, pair._2.attr))* val partitionedGraph = Graph(unpartitionedGraph.vertices, partitionedEdges) Ankur http://www.ankurdave.com/
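
The idea in the two messages above (key each edge by a custom partition id, then hash-partition on that key) can be sketched with plain Scala collections, so it runs without Spark. Everything here is an assumption for illustration: `Edge` is a stand-in case class rather than GraphX's class, the body of `getTripletPartition` is invented (it reads a precomputed partition id out of the edge attribute, as a multipass algorithm might store it), and `bucket` assumes a hash partitioner maps an Int key to a partition by non-negative modulo.

```scala
object EdgePartitionSketch {
  // Hypothetical stand-in for a graph edge; not GraphX's Edge class.
  final case class Edge(srcId: Long, dstId: Long, attr: Int)

  // Hypothetical partition function: the attribute already holds the
  // partition id chosen by an earlier pass.
  def getTripletPartition(e: Edge): Int = e.attr

  // Assumed hash-partitioner behavior for an Int key: non-negative modulo.
  def bucket(key: Int, numPartitions: Int): Int = {
    val m = key % numPartitions
    if (m < 0) m + numPartitions else m
  }

  // Group edges by the partition their key maps to.
  def partitionEdges(edges: Seq[Edge], numPartitions: Int): Map[Int, Seq[Edge]] =
    edges.groupBy(e => bucket(getTripletPartition(e), numPartitions))
}
```

For example, `EdgePartitionSketch.partitionEdges(edges, 2)` returns a map from partition index to the edges assigned to it. In Spark the grouping would instead be done by `partitionBy` on a keyed RDD, as in the snippet above.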

Re: GraphX can not unpersist edges of old graph?

2014-06-12 Thread Ankur Dave
() and graph.edges.unpersist(). By the way, the memory leak bug with Pregel (SPARK-2025 https://issues.apache.org/jira/browse/SPARK-2025) is fixed in master. Ankur http://www.ankurdave.com/

Re: Removing spark-debugger.md file from master?

2014-06-03 Thread Ankur Dave
I agree, let's go ahead and remove it. Ankur http://www.ankurdave.com/

Re: [VOTE] Release Apache Spark 1.0.0 (RC11)

2014-05-27 Thread Ankur Dave
0 OK, I withdraw my downvote. Ankur http://www.ankurdave.com/

Re: Spark 1.0: outerJoinVertices seems to return null for vertex attributes when input was partitioned and vertex attribute type is changed

2014-05-26 Thread Ankur Dave
This is probably due to SPARK-1931 https://issues.apache.org/jira/browse/SPARK-1931, which I just fixed in PR #885 https://github.com/apache/spark/pull/885. Is the problem resolved if you use the current Spark master? Ankur http://www.ankurdave.com/