Hi Julian,

Yeah ... I agree that rewriting it would make things easier than trying to fix it.

Chris



On 12.01.20, 14:15, "Julian Feinauer" <[email protected]> wrote:

    Hi,
    
    To give a little bit of insight:
    The scraper was initially created by me during our Mallorca Hackathon waaaay back.
    And some of the logic still works that way... I remember using tons of nested Tuple2 and Tuple3...
    It was then "handed" over to Tim, who improved it significantly by introducing things like "triggered scraping" (a trigger condition is checked frequently and a scrape is only performed when the condition matches; see the rough sketch below).
    But as this work also happened pretty close to our "development" process, or rather to production issues, the code quality was sometimes not... well... improved : )
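    
    Roughly, the idea behind "triggered scraping" is something like this (just a sketch, the names are made up for illustration and are not the current API):
    
        import java.util.concurrent.ScheduledExecutorService;
        import java.util.concurrent.TimeUnit;
        import java.util.function.BooleanSupplier;
        
        class TriggeredScrapeSketch {
            // The trigger condition is evaluated on a fixed schedule; the
            // (possibly expensive) scrape itself only runs when it matches.
            static void scheduleTriggered(ScheduledExecutorService scheduler,
                                          BooleanSupplier triggerCondition,
                                          Runnable scrapeJob,
                                          long checkIntervalMs) {
                scheduler.scheduleAtFixedRate(() -> {
                    if (triggerCondition.getAsBoolean()) {   // e.g. a cheap "read the trigger field" request
                        scrapeJob.run();
                    }
                }, 0, checkIntervalMs, TimeUnit.MILLISECONDS);
            }
        }
    
    In practice the condition would of course be another (cheap) read from the PLC, not just a local check.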
    
    So although the scraper works (and we still have it in production), I share the same pain in the stomach as Chris does.
    So I would agree to a rewrite, and in particular we should probably handle edge cases more gracefully.
    Good examples of edge cases are:
    - Connections which take pretty long to establish (e.g. via VPN)
    - Very short scrape intervals (< 10 ms)
    - Scraping LOTS of variables from a single PLC at the same time
    
    Oh, and another note: Tim kind of specialized it a bit for S7 and we never properly refactored it back to being generic.
    
    So I think if we draw up a clear concept and just rewrite it, it should be pretty easy and end up more robust.
    Under the hood it's not much more than a Connection Pool and a ScheduledExecutor, right?
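    
    Something along these lines (again just a sketch; the types are placeholders, not the real PLC4X classes):
    
        import java.time.Duration;
        import java.util.concurrent.BlockingQueue;
        import java.util.concurrent.Executors;
        import java.util.concurrent.ScheduledExecutorService;
        import java.util.concurrent.TimeUnit;
        import java.util.function.Consumer;
        
        class ScraperCoreSketch {
            // Stand-in for a pooled PlcConnection
            interface PooledConnection {}
        
            private final BlockingQueue<PooledConnection> pool;
            private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);
        
            ScraperCoreSketch(BlockingQueue<PooledConnection> pool) {
                this.pool = pool;
            }
        
            void schedule(Duration rate, Consumer<PooledConnection> scrape) {
                scheduler.scheduleAtFixedRate(() -> {
                    PooledConnection connection = null;
                    try {
                        connection = pool.poll(5, TimeUnit.SECONDS);   // borrow a connection from the pool
                        if (connection != null) {
                            scrape.accept(connection);                 // do the actual read(s)
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        if (connection != null) {
                            pool.offer(connection);                    // hand it back to the pool
                        }
                    }
                }, 0, rate.toMillis(), TimeUnit.MILLISECONDS);
            }
        }
    
    Everything else (triggering, result handling, reconnects) would sit on top of that.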
    
    Best!
    Julian
    
    On 12.01.20, 13:32, "Lukas Ott" <[email protected]> wrote:
    
        Hi,
        
        +1 for rewriting the scraper. In my humble opinion the PLC4X project
        still aims for multi-language support, and the scraper, including its
        integrations into Calcite, Kafka and Logstash, is a core capability
        that should be supported.
        
        Lukas
        
        On Sun., Jan 12, 2020 at 12:51, Christofer Dutz <[email protected]> wrote:
        
        > Hi all,
        >
        > for about 7 full days I have been cleaning up the new branch in order
        > to port all the other drivers besides the S7 one to the new API …
        > This forced me to go through just about all the modules we have.
        >
        > One module that worries me, however, is the scraper. It’s a core
        > module we use in the calcite-integration, kafka-connect and logstash
        > modules.
        > The last two are gaining quite a lot of traction.
        >
        > However, having dug into the current scraper, I feel very
        > uncomfortable with it … so I would propose to completely rewrite it.
        > I tried refactoring it for 1.5 days and just recently simply reverted
        > my changes … currently I’m just trying to get things to build again.
        >
        > As Julian told me, he and his company have sort of moved away from
        > the scraper to something new … I would like to discuss the
        > alternatives with you.
        >
        > Right now it feels impossible for me to provide support if anything
        > goes wrong in the scraper.
        >
        > Chris
        >
        >
        >
        
    
    
