Hi M. Mothilal,

Attached is a template that filters on a numeric Twitter user ID (my own ID, which I looked up at [1]). As I am sure you already know, you will need to change the tokens and keys to match your application. After setting up the configuration, I went to Twitter and retweeted a tweet, and moments later it appeared in my NiFi flow. The actions that should cause data to appear are listed at [2].
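For reference, a trimmed excerpt of the GetTwitter properties from the attached template is sketched below. The credential values shown are placeholders substituted for illustration, not values from a working application. The two secret entries appear empty in the template as attached, so they would need to be entered by hand after importing it.

```xml
<!-- Excerpt only: GetTwitter properties relevant to filtering by user ID.
     YOUR_CONSUMER_KEY and YOUR_ACCESS_TOKEN are placeholder values (assumptions). -->
<properties>
  <entry>
    <key>Twitter Endpoint</key>
    <value>Filter Endpoint</value>
  </entry>
  <entry>
    <key>Consumer Key</key>
    <value>YOUR_CONSUMER_KEY</value>
  </entry>
  <entry>
    <!-- empty in the attached template; set it in the processor configuration -->
    <key>Consumer Secret</key>
  </entry>
  <entry>
    <key>Access Token</key>
    <value>YOUR_ACCESS_TOKEN</value>
  </entry>
  <entry>
    <!-- empty in the attached template; set it in the processor configuration -->
    <key>Access Token Secret</key>
  </entry>
  <entry>
    <!-- numeric user ID, not the screen name -->
    <key>IDs to Follow</key>
    <value>100607628</value>
  </entry>
</properties>
```

The value of `IDs to Follow` is the numeric user ID rather than the screen name. Several IDs can be supplied as a comma-separated list.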
Hopefully this helps.

[1] http://gettwitterid.com/?user_name=itmdata&submit=GET+USER+ID
[2] https://dev.twitter.com/streaming/overview/request-parameters#follow

Thanks,
Andrew

On Tue, Nov 8, 2016 at 6:59 PM, Mothilal M <moth...@gmail.com> wrote:

> Hi Bryan Bende,
>
> Greetings. I did try the filter endpoint and it certainly needs the numeric ID and not the twitter ID. The processor shows an error if given a twitter ID - so I had converted it to the respective numeric ID. Please advise if there is any sample worked on this use case - filtered tweets based on specific IDs.
>
> Warm Regards,
> M. Mothilal
>
> On Tue, Nov 8, 2016 at 2:45 PM, Mothilal M <moth...@gmail.com> wrote:
>
> > Hi Joe,
> >
> > Greetings and appreciate your reply. Totally agree with your opinion on comparisons.
> >
> > Looking forward to other developers commenting on suggestions for the getTwitter and getHTTP processor issues for twitter and facebook integration.
> >
> > Warm Regards,
> > M. Mothilal
> >
> > On Tue, Nov 8, 2016 at 2:21 PM, Mothilal M <moth...@gmail.com> wrote:
> >
> >> Greetings Joe,
> >>
> >> Glad to connect with you from your reply in the Apache NiFi community. I am integrating NiFi with social media - twitter and Facebook - to achieve analysis but I am stuck at a point. Request your assistance in solving the same.
> >>
> >> getTwitter processor - I am not able to get tweets filtered by twitter IDs. (The IDs to follow field value is not working as expected. I did get the numeric value of the twitter IDs and input them in the value as comma separated but no results). Please advise as I look to filter my tweets on particular IDs and also by particular location by cities like Toronto, Calgary.
> >>
> >> getHTTP processor for facebook integration works fine but I am unable to fetch data from my personal facebook ID. I tried accessing the Graph API and it mentions user permissions are not granted but I did provide the essential permissions in app settings.
> >> Please advise if you could help out in setting the facebook app permissions to ensure I can fetch the user_status, user_location fields. On success, I would need to venture out a few options further.
> >>
> >> General question - How could Apache NiFi be useful in replacing Splunk? I understand most of the current companies have a paid version of Splunk and NiFi is open source, but please let me know the features which are key in replacing Splunk.
> >>
> >> I appreciate your time and it would be more helpful if you could let me know your comfortable timing for a technical call discussion. I am available and reachable on +1(647)641-0910. Looking forward to your reply.
> >>
> >> Warm Regards,
> >> M. Mothilal
> >>
> >> ------------------------------------------------------------------------------------------
> >>
> >> Hello,
> >>
> >> I think as a community we're all quite happy to help address questions or ideas you might have for Apache NiFi. Please feel free to ask your questions here and if they're more from the perspective of a user please use the us...@nifi.apache.org list.
> >>
> >> For the technical questions it is probably best to align them with some portion of the overview, administration, user, best practice, or developer guide so we can use that as a basis for further discussion. You can find those here: https://nifi.apache.org/docs.html
> >>
> >> Thanks
> >> Joe
> >>
> >>
> >> ---------- Forwarded message ----------
> >> From: Mothilal M <moth...@gmail.com>
> >> Date: Tue, Nov 8, 2016 at 1:25 PM
> >> Subject: Need Information - Apache NiFi
> >> To: dev@nifi.apache.org
> >>
> >>
> >> Hi Apache NiFi team,
> >>
> >> Greetings. I did my hands-on experience with multiple tutorials around sentiment analysis integrating social media like twitter and facebook and also log analysis.
> >> Could you help me understand in what features Apache NiFi would act as an alternative to Splunk?
> >>
> >> I do have a couple of technical clarifications in Apache NiFi - would it be possible to have an initial technical call to clarify my questions?
> >>
> >> Warm Regards,
> >> M. Mothilal
> >> Cell : +1(647)641-0910.
> >>
> >>
> >
>

--
Thanks,
Andrew

Subscribe to my book: Streaming Data <http://manning.com/psaltis>
<https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>

<?xml version="1.0" ?> <template encoding-version="1.0"> <description></description> <groupId>f5ca9391-0f9a-4e95-839c-2ff3461ef37f</groupId> <name>FilterTwitterByID</name> <snippet> <connections> <id>8554ca84-cc2a-4ad2-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold> <backPressureObjectThreshold>0</backPressureObjectThreshold> <destination> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>d0e55145-6505-48ec-0000-000000000000</id> <type>PROCESSOR</type> </destination> <flowFileExpiration>0 sec</flowFileExpiration> <labelIndex>1</labelIndex> <name></name> <selectedRelationships>matched</selectedRelationships> <source> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>3ee301f8-d1b6-4d09-0000-000000000000</id> <type>PROCESSOR</type> </source> <zIndex>0</zIndex> </connections> <connections> <id>865fcabf-5b5c-4e2e-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold> <backPressureObjectThreshold>0</backPressureObjectThreshold> <destination> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>d3c85c30-7a66-4351-0000-000000000000</id> <type>PROCESSOR</type> </destination> <flowFileExpiration>0 sec</flowFileExpiration> <labelIndex>1</labelIndex> <name></name> <selectedRelationships>tweet</selectedRelationships> <source> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>d0e55145-6505-48ec-0000-000000000000</id> <type>PROCESSOR</type> </source> <zIndex>0</zIndex> </connections> <connections> <id>bb162957-16b6-4818-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold> <backPressureObjectThreshold>0</backPressureObjectThreshold> <destination> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> 
<id>63500cda-60fc-433e-0000-000000000000</id> <type>PROCESSOR</type> </destination> <flowFileExpiration>0 sec</flowFileExpiration> <labelIndex>1</labelIndex> <name></name> <selectedRelationships>success</selectedRelationships> <source> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>d3c85c30-7a66-4351-0000-000000000000</id> <type>PROCESSOR</type> </source> <zIndex>0</zIndex> </connections> <connections> <id>cbd79ce4-aa7f-47d5-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold> <backPressureObjectThreshold>0</backPressureObjectThreshold> <destination> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>3ee301f8-d1b6-4d09-0000-000000000000</id> <type>PROCESSOR</type> </destination> <flowFileExpiration>0 sec</flowFileExpiration> <labelIndex>1</labelIndex> <name></name> <selectedRelationships>success</selectedRelationships> <source> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>2c9405dd-f5cf-41eb-0000-000000000000</id> <type>PROCESSOR</type> </source> <zIndex>0</zIndex> </connections> <connections> <id>211507c9-92e6-49ea-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold> <backPressureObjectThreshold>0</backPressureObjectThreshold> <destination> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>6f28e92f-9486-45ad-0000-000000000000</id> <type>PROCESSOR</type> </destination> <flowFileExpiration>0 sec</flowFileExpiration> <labelIndex>1</labelIndex> <name></name> <selectedRelationships>merged</selectedRelationships> <source> <groupId>f5ca9391-0f9a-4e95-0000-000000000000</groupId> <id>63500cda-60fc-433e-0000-000000000000</id> <type>PROCESSOR</type> </source> <zIndex>0</zIndex> </connections> <processors> <id>d0e55145-6505-48ec-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> 
<position> <x>14.516555786132812</x> <y>646.3788655853272</y> </position> <config> <bulletinLevel>WARN</bulletinLevel> <comments></comments> <concurrentlySchedulableTaskCount>2</concurrentlySchedulableTaskCount> <descriptors> <entry> <key>Routing Strategy</key> <value> <name>Routing Strategy</name> </value> </entry> <entry> <key>tweet</key> <value> <name>tweet</name> </value> </entry> </descriptors> <lossTolerant>false</lossTolerant> <penaltyDuration>30 sec</penaltyDuration> <properties> <entry> <key>Routing Strategy</key> <value>Route to Property name</value> </entry> <entry> <key>tweet</key> <value>${twitter.msg:isEmpty():not()}</value> </entry> </properties> <runDurationMillis>25</runDurationMillis> <schedulingPeriod>0 sec</schedulingPeriod> <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy> <yieldDuration>1 sec</yieldDuration> </config> <name>Find only Tweets</name> <relationships> <autoTerminate>false</autoTerminate> <name>tweet</name> </relationships> <relationships> <autoTerminate>true</autoTerminate> <name>unmatched</name> </relationships> <style></style> <type>org.apache.nifi.processors.standard.RouteOnAttribute</type> </processors> <processors> <id>d3c85c30-7a66-4351-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <position> <x>687.4362933187049</x> <y>533.1646658637524</y> </position> <config> <bulletinLevel>WARN</bulletinLevel> <comments></comments> <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount> <descriptors> <entry> <key>Regular Expression</key> <value> <name>Regular Expression</name> </value> </entry> <entry> <key>Replacement Value</key> <value> <name>Replacement Value</name> </value> </entry> <entry> <key>Character Set</key> <value> <name>Character Set</name> </value> </entry> <entry> <key>Maximum Buffer Size</key> <value> <name>Maximum Buffer Size</name> </value> </entry> <entry> <key>Replacement Strategy</key> <value> <name>Replacement Strategy</name> </value> </entry> <entry> 
<key>Evaluation Mode</key> <value> <name>Evaluation Mode</name> </value> </entry> </descriptors> <lossTolerant>false</lossTolerant> <penaltyDuration>30 sec</penaltyDuration> <properties> <entry> <key>Regular Expression</key> <value>(?s:^(.*)$)</value> </entry> <entry> <key>Replacement Value</key> <value>${twitter.tweet_id}|${twitter.unixtime}|${twitter.time}|${twitter.handle}|${twitter.msg:replace('$',''):replace('\n','')}|$1</value> </entry> <entry> <key>Character Set</key> <value>UTF-8</value> </entry> <entry> <key>Maximum Buffer Size</key> <value>1 MB</value> </entry> <entry> <key>Replacement Strategy</key> <value>Regex Replace</value> </entry> <entry> <key>Evaluation Mode</key> <value>Entire text</value> </entry> </properties> <runDurationMillis>0</runDurationMillis> <schedulingPeriod>0 sec</schedulingPeriod> <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy> <yieldDuration>1 sec</yieldDuration> </config> <name>ReplaceText</name> <relationships> <autoTerminate>true</autoTerminate> <name>failure</name> </relationships> <relationships> <autoTerminate>false</autoTerminate> <name>success</name> </relationships> <style></style> <type>org.apache.nifi.processors.standard.ReplaceText</type> </processors> <processors> <id>2c9405dd-f5cf-41eb-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <position> <x>13.447357177734375</x> <y>0.0</y> </position> <config> <bulletinLevel>WARN</bulletinLevel> <comments></comments> <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount> <descriptors> <entry> <key>Twitter Endpoint</key> <value> <name>Twitter Endpoint</name> </value> </entry> <entry> <key>Consumer Key</key> <value> <name>Consumer Key</name> </value> </entry> <entry> <key>Consumer Secret</key> <value> <name>Consumer Secret</name> </value> </entry> <entry> <key>Access Token</key> <value> <name>Access Token</name> </value> </entry> <entry> <key>Access Token Secret</key> <value> <name>Access Token Secret</name> 
</value> </entry> <entry> <key>Languages</key> <value> <name>Languages</name> </value> </entry> <entry> <key>Terms to Filter On</key> <value> <name>Terms to Filter On</name> </value> </entry> <entry> <key>IDs to Follow</key> <value> <name>IDs to Follow</name> </value> </entry> <entry> <key>Locations to Filter On</key> <value> <name>Locations to Filter On</name> </value> </entry> </descriptors> <lossTolerant>false</lossTolerant> <penaltyDuration>30 sec</penaltyDuration> <properties> <entry> <key>Twitter Endpoint</key> <value>Filter Endpoint</value> </entry> <entry> <key>Consumer Key</key> <value>VoofeZxp0huKDXbVsurFWwW2K</value> </entry> <entry> <key>Consumer Secret</key> </entry> <entry> <key>Access Token</key> <value>100607628-FOQ7Uu6GVkqHLWmQQBHC5ewqcRpAI9SR21ERNnmD</value> </entry> <entry> <key>Access Token Secret</key> </entry> <entry> <key>Languages</key> </entry> <entry> <key>Terms to Filter On</key> </entry> <entry> <key>IDs to Follow</key> <value>100607628</value> </entry> <entry> <key>Locations to Filter On</key> </entry> </properties> <runDurationMillis>0</runDurationMillis> <schedulingPeriod>0 sec</schedulingPeriod> <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy> <yieldDuration>1 sec</yieldDuration> </config> <name>Grab Garden Hose</name> <relationships> <autoTerminate>false</autoTerminate> <name>success</name> </relationships> <style></style> <type>org.apache.nifi.processors.twitter.GetTwitter</type> </processors> <processors> <id>3ee301f8-d1b6-4d09-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <position> <x>0.0</x> <y>332.28023612976074</y> </position> <config> <bulletinLevel>ERROR</bulletinLevel> <comments></comments> <concurrentlySchedulableTaskCount>4</concurrentlySchedulableTaskCount> <descriptors> <entry> <key>Destination</key> <value> <name>Destination</name> </value> </entry> <entry> <key>Return Type</key> <value> <name>Return Type</name> </value> </entry> <entry> <key>Path Not Found 
Behavior</key> <value> <name>Path Not Found Behavior</name> </value> </entry> <entry> <key>Null Value Representation</key> <value> <name>Null Value Representation</name> </value> </entry> <entry> <key>language</key> <value> <name>language</name> </value> </entry> <entry> <key>twitter.handle</key> <value> <name>twitter.handle</name> </value> </entry> <entry> <key>twitter.hashtags</key> <value> <name>twitter.hashtags</name> </value> </entry> <entry> <key>twitter.msg</key> <value> <name>twitter.msg</name> </value> </entry> <entry> <key>twitter.time</key> <value> <name>twitter.time</name> </value> </entry> <entry> <key>twitter.tweet_id</key> <value> <name>twitter.tweet_id</name> </value> </entry> <entry> <key>twitter.unixtime</key> <value> <name>twitter.unixtime</name> </value> </entry> <entry> <key>twitter.user</key> <value> <name>twitter.user</name> </value> </entry> </descriptors> <lossTolerant>false</lossTolerant> <penaltyDuration>30 sec</penaltyDuration> <properties> <entry> <key>Destination</key> <value>flowfile-attribute</value> </entry> <entry> <key>Return Type</key> <value>auto-detect</value> </entry> <entry> <key>Path Not Found Behavior</key> <value>ignore</value> </entry> <entry> <key>Null Value Representation</key> <value>empty string</value> </entry> <entry> <key>language</key> <value>$.lang</value> </entry> <entry> <key>twitter.handle</key> <value>$.user.screen_name</value> </entry> <entry> <key>twitter.hashtags</key> <value>$.entities.hashtags[0].text</value> </entry> <entry> <key>twitter.msg</key> <value>$.text</value> </entry> <entry> <key>twitter.time</key> <value>$.created_at</value> </entry> <entry> <key>twitter.tweet_id</key> <value>$.id</value> </entry> <entry> <key>twitter.unixtime</key> <value>$.timestamp_ms</value> </entry> <entry> <key>twitter.user</key> <value>$.user.name</value> </entry> </properties> <runDurationMillis>25</runDurationMillis> <schedulingPeriod>0 sec</schedulingPeriod> <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy> 
<yieldDuration>1 sec</yieldDuration> </config> <name>Pull Key Attributes</name> <relationships> <autoTerminate>true</autoTerminate> <name>failure</name> </relationships> <relationships> <autoTerminate>false</autoTerminate> <name>matched</name> </relationships> <relationships> <autoTerminate>true</autoTerminate> <name>unmatched</name> </relationships> <style></style> <type>org.apache.nifi.processors.standard.EvaluateJsonPath</type> </processors> <processors> <id>63500cda-60fc-433e-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <position> <x>1502.552418823797</x> <y>471.72640043737675</y> </position> <config> <bulletinLevel>WARN</bulletinLevel> <comments></comments> <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount> <descriptors> <entry> <key>Merge Strategy</key> <value> <name>Merge Strategy</name> </value> </entry> <entry> <key>Merge Format</key> <value> <name>Merge Format</name> </value> </entry> <entry> <key>Attribute Strategy</key> <value> <name>Attribute Strategy</name> </value> </entry> <entry> <key>Correlation Attribute Name</key> <value> <name>Correlation Attribute Name</name> </value> </entry> <entry> <key>Minimum Number of Entries</key> <value> <name>Minimum Number of Entries</name> </value> </entry> <entry> <key>Maximum Number of Entries</key> <value> <name>Maximum Number of Entries</name> </value> </entry> <entry> <key>Minimum Group Size</key> <value> <name>Minimum Group Size</name> </value> </entry> <entry> <key>Maximum Group Size</key> <value> <name>Maximum Group Size</name> </value> </entry> <entry> <key>Max Bin Age</key> <value> <name>Max Bin Age</name> </value> </entry> <entry> <key>Maximum number of Bins</key> <value> <name>Maximum number of Bins</name> </value> </entry> <entry> <key>Delimiter Strategy</key> <value> <name>Delimiter Strategy</name> </value> </entry> <entry> <key>Header File</key> <value> <name>Header File</name> </value> </entry> <entry> <key>Footer File</key> <value> 
<name>Footer File</name> </value> </entry> <entry> <key>Demarcator File</key> <value> <name>Demarcator File</name> </value> </entry> <entry> <key>Compression Level</key> <value> <name>Compression Level</name> </value> </entry> <entry> <key>Keep Path</key> <value> <name>Keep Path</name> </value> </entry> </descriptors> <lossTolerant>false</lossTolerant> <penaltyDuration>30 sec</penaltyDuration> <properties> <entry> <key>Merge Strategy</key> <value>Bin-Packing Algorithm</value> </entry> <entry> <key>Merge Format</key> <value>Binary Concatenation</value> </entry> <entry> <key>Attribute Strategy</key> <value>Keep Only Common Attributes</value> </entry> <entry> <key>Correlation Attribute Name</key> </entry> <entry> <key>Minimum Number of Entries</key> <value>20</value> </entry> <entry> <key>Maximum Number of Entries</key> <value>1000</value> </entry> <entry> <key>Minimum Group Size</key> <value>0 B</value> </entry> <entry> <key>Maximum Group Size</key> </entry> <entry> <key>Max Bin Age</key> <value>120 seconds</value> </entry> <entry> <key>Maximum number of Bins</key> <value>100</value> </entry> <entry> <key>Delimiter Strategy</key> <value>Filename</value> </entry> <entry> <key>Header File</key> </entry> <entry> <key>Footer File</key> </entry> <entry> <key>Demarcator File</key> </entry> <entry> <key>Compression Level</key> <value>1</value> </entry> <entry> <key>Keep Path</key> <value>false</value> </entry> </properties> <runDurationMillis>0</runDurationMillis> <schedulingPeriod>0 sec</schedulingPeriod> <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy> <yieldDuration>1 sec</yieldDuration> </config> <name>MergeContent</name> <relationships> <autoTerminate>true</autoTerminate> <name>failure</name> </relationships> <relationships> <autoTerminate>false</autoTerminate> <name>merged</name> </relationships> <relationships> <autoTerminate>true</autoTerminate> <name>original</name> </relationships> <style></style> <type>org.apache.nifi.processors.standard.MergeContent</type> 
</processors> <processors> <id>6f28e92f-9486-45ad-0000-000000000000</id> <parentGroupId>f5ca9391-0f9a-4e95-0000-000000000000</parentGroupId> <position> <x>855.3241012373969</x> <y>717.2001911008986</y> </position> <config> <bulletinLevel>WARN</bulletinLevel> <comments></comments> <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount> <descriptors> <entry> <key>Directory</key> <value> <name>Directory</name> </value> </entry> <entry> <key>Conflict Resolution Strategy</key> <value> <name>Conflict Resolution Strategy</name> </value> </entry> <entry> <key>Create Missing Directories</key> <value> <name>Create Missing Directories</name> </value> </entry> <entry> <key>Maximum File Count</key> <value> <name>Maximum File Count</name> </value> </entry> <entry> <key>Last Modified Time</key> <value> <name>Last Modified Time</name> </value> </entry> <entry> <key>Permissions</key> <value> <name>Permissions</name> </value> </entry> <entry> <key>Owner</key> <value> <name>Owner</name> </value> </entry> <entry> <key>Group</key> <value> <name>Group</name> </value> </entry> </descriptors> <lossTolerant>false</lossTolerant> <penaltyDuration>30 sec</penaltyDuration> <properties> <entry> <key>Directory</key> <value>/tmp/tweets</value> </entry> <entry> <key>Conflict Resolution Strategy</key> <value>fail</value> </entry> <entry> <key>Create Missing Directories</key> <value>true</value> </entry> <entry> <key>Maximum File Count</key> </entry> <entry> <key>Last Modified Time</key> </entry> <entry> <key>Permissions</key> </entry> <entry> <key>Owner</key> </entry> <entry> <key>Group</key> </entry> </properties> <runDurationMillis>0</runDurationMillis> <schedulingPeriod>0 sec</schedulingPeriod> <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy> <yieldDuration>1 sec</yieldDuration> </config> <name>PutFile</name> <relationships> <autoTerminate>true</autoTerminate> <name>failure</name> </relationships> <relationships> <autoTerminate>true</autoTerminate> <name>success</name> 
</relationships> <style></style> <type>org.apache.nifi.processors.standard.PutFile</type> </processors> </snippet> <timestamp>11/09/2016 11:36:15 EST</timestamp> </template>