Hi Ralph,

Here is the data. MorphlineSolrSink and MorphlineInterceptor appear 13-14% of the time in this sample set.

org.apache.flume.channel.kafka.KafkaChannel 64%
hdfs 38%
org.apache.flume.source.kafka.KafkaSource 33%
memory 29%
file 28%
spooldir 26%
null 25%
org.apache.flume.sink.kafka.KafkaSink 25%
Custom JMSSource 18%
jms 17%
static 14%
org.apache.flume.sink.solr.morphline.MorphlineInterceptor 14%
ElasticSearchSink 13%
org.apache.flume.sink.solr.morphline.MorphlineSolrSink 13%
host 13%
timestamp 13%
avro 11%
hbase 11%

Let me know if you’d like to drill down further.

Tristan

From: Ralph Goers <ralph.go...@dslextreme.com>
Reply: dev@flume.apache.org <dev@flume.apache.org>
Date: 15 January 2022 at 06:48:43
To: dev@flume.apache.org <dev@flume.apache.org>
Subject: Re: Morphlines-solr-sink

I would like to see the data on the usage. I’m not sure how you would know, since Cloudera doesn’t seem to include Flume in its products any more from what I can tell.

The kite-morphlines project consists of 18 sub-modules plus 4 aggregation modules. That is a heck of a lot of stuff to try to drag in. I would prefer to fork the parts of kite we would need to a new flume-kite repo.

It seems that the CVE the reporter mentioned does have a fix. It is available in parquet-avro 1.11.2 and 1.12.2. I was able to swap the new version for the old one even though the groupId has changed. That said, the kite-sdk dependency that includes it is marked as optional, so parquet-avro would be optional as well. So I have no idea if it is even used. In any case, the unit tests all pass with the updated dependency.

Ralph

> On Jan 14, 2022, at 3:33 PM, Tristan Stevens <tris...@apache.org> wrote:
>
> -1 from me.
>
> First, we can’t do that in a patch release, but that’s semantics.
>
> Both the Morphlines interceptor and the Morphlines-Solr-Sink are components
> that are widely used amongst the community. I did some analysis last year
> that I’ll dig out and share, but they are two of the most used components
> after HDFS sink, Kafka and JMS.
>
> Whilst I agree it’s sucky that Cloudera aren’t supporting Kite anymore, I
> wonder whether we can find a way to bring Morphlines into here, or otherwise
> get upstream and fix the bits that need fixing.
>
> Tristan
>
>
> From: Ralph Goers <ralph.go...@dslextreme.com>
> Reply: dev@flume.apache.org <dev@flume.apache.org>
> Date: 13 January 2022 at 15:26:12
> To: dev@flume.apache.org <dev@flume.apache.org>
> Subject: Morphlines-solr-sink
>
>> While I am not having any trouble building the morphline-solr-sink
>> component, it is dependent on the abandoned kite-sdk, which makes its life
>> very limited.
>>
>> In addition, the kite-sdk has a dependency on parquet-avro which, according
>> to https://issues.apache.org/jira/browse/FLUME-3403, has vulnerabilities in
>> every available release.
>>
>> Due to these factors I am going to remove the morphline-solr-sink module
>> from Flume for the 1.10.0 release.
>>
>> Ralph