Hi,
I wanted to understand forEachPartition logic. In the code below, I am
assuming the iterator is executing in a distributed fashion.

1. Assuming I have a stream which has timestamp data which is sorted. Will
the stringiterator in foreachPartition process each line in order?

2. Assuming I have a static pool of Kafka connections, where should I get a
connection from a pool to be used to send data to Kafka?

addMTSUnmatched.foreachRDD(
        new Function<JavaRDD<String>, Void>() {
            @Override
            public Void call(JavaRDD<String> stringJavaRDD) throws Exception {
                stringJavaRDD.foreachPartition(

                        new VoidFunction<Iterator<String>>() {
                            @Override
                            public void call(Iterator<String>
stringIterator) throws Exception {
                                while(stringIterator.hasNext()){
                                    String str = stringIterator.next();
                                    if(OnlineUtils.ESFlag) {
                                        OnlineUtils.printToFile(str,
1, type1_outputFile, OnlineUtils.client);
                                    }else{
                                        OnlineUtils.printToFile(str,
1, type1_outputFile);
                                    }
                                }
                            }
                        }
                );
                return null;
            }
        }
);



Thanks

Nipun

Reply via email to