20: eina_tmpstr: add eina_tmpstr_strftime

SHILPA ONKAR SINGH Mon, 28 Sep 2015 23:06:07 -0700


------- Original Message -------
Sender : Amitesh Singh<[email protected]>
Date : Sep 29, 2015 02:03 (GMT+09:00)
Title : Re: [E-devel] [EGIT] [core/efl] master 17/20: eina_tmpstr: add 
eina_tmpstr_strftime


Hi,


On Mon, Sep 28, 2015 at 7:35 PM, Tom Hacohen wrote:

> On 28/09/15 09:53, Tom Hacohen wrote:
> > On 24/09/15 18:53, Cedric BAIL wrote:
> >> On Thu, Sep 24, 2015 at 10:20 AM, Tom Hacohen 
> wrote:
> >>> On 23/09/15 19:56, Cedric BAIL wrote:
> >>>> On Wed, Sep 23, 2015 at 1:18 AM, Tom Hacohen 
> wrote:
> >>>>> On 22/09/15 18:31, Cedric BAIL wrote:
> >>>>>> Le 22 sept. 2015 09:40, "Tom Hacohen" a
> écrit :
> >>>>>>>
> >>>>>>> On 22/09/15 17:32, Cedric BAIL wrote:
> >>>>>>>> Le 22 sept. 2015 02:30, "Daniel Kolesa" a
> écrit :
> >>>>>>>>>
> >>>>>>>>> On Mon, Sep 21, 2015 at 11:24 PM, Shilpa Singh <
> >>>>>> [email protected]>
> >>>>>>>> wrote:
> >>>>>>>>>> cedric pushed a commit to branch master.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>
> http://git.enlightenment.org/core/efl.git/commit/?id=abaf29cb768375957c9ee0b64d36034c21c618ea
> >>>>>>>>>>
> >>>>>>>>>> commit abaf29cb768375957c9ee0b64d36034c21c618ea
> >>>>>>>>>> Author: Shilpa Singh 
> >>>>>>>>>> Date:   Mon Sep 21 23:48:16 2015 +0200
> >>>>>>>>>>
> >>>>>>>>>>          eina_tmpstr: add eina_tmpstr_strftime
> >>>>>>>>>
> >>>>>>>>> This API seems awfully arbitrary and I don't really see the
> point.
> >>>>>>>>> Might as well be any other function that prints to strings - are
> you
> >>>>>>>>> gonna add these too? Sounds horrible to me.
> >>>>>>>>
> >>>>>>>> Is it better to have our code cluttered by static buffer and no
> overflow
> >>>>>>>> check ? Obviously not. Yes,I expect we refactor all our buffer
> print
> >>>>>> code
> >>>>>>>> as it is pretty bad right now and may even lead to security issue
> in
> >>>>>> some
> >>>>>>>> case.
> >>>>>>>
> >>>>>>> Just to chime in, I think this doomsday scenario is a bit
> exaggerated.
> >>>>>>> I think having code like:
> >>>>>>>
> >>>>>>> char buf[SOME_LEN];
> >>>>>>> strftime(...buf...);
> >>>>>>> return tmpstr_add(buf);
> >>>>>>>
> >>>>>>> Is a much cleaner solution than adding all of the string functions
> of
> >>>>>>> the world to tmpstr, strbuf, stringshare and etc.
> >>>>>>>
> >>>>>>> I really don't like code duplication, and I think that's where
> Daniel is
> >>>>>>> coming from.
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>
> >>>>>> It's a good example that won't always work and increase at best the
> stack
> >>>>>> size requirement for absolutely no good reason.
> >>>>>
> >>>>> Why wouldn't it always work?
> >>>>> Also if you enclose it in its own sub {}, every compiler will treat
> the
> >>>>> stack sanely, and even without it, I'd assume optimising compilers
> will.
> >>>>
> >>>> The previous implementation used in Elementary was not working with
> >>>> some local where it used way more charactere than expected. Work
> >>>> around solution is to be over generous on memory...
> >>>>
> >>>> But compiler have no way to optimize it at all. How could they know
> >>>> before jumping to a function call, that is going to generate data at
> >>>> runtime, how much that function is going to use to limit the size of
> >>>> the buffer on the stack they give to it. No compiler can optimize
> >>>> that. It's just impossible. Every time we put a 4K buffer on the stack
> >>>> all further function call will force that page allocation, that's how
> >>>> things are.
> >>>
> >>> As implied by my comment about the own sub {}, I was talking about
> >>> optimising the over-stack-usage.
> >>>
> >>> char buf[SOMELEN];
> >>> strftime(buf);
> >>> ret = tmpstr_add(buf);
> >>> // buf is not used anymore, stack can be cleared.
> >>>
> >>> That's what I meant. It's true though that compilers can't assume we
> >>> haven't kept the buf pointer and used it somewhere else (which we are
> >>> allowed to until the function ended). Though that's not what you were
> >>> talking about.
> >>
> >> That is another case that a compiler can not optimize. It has no clue
> >> that buf is not stored somewhere globally, or that ret doesn't point
> >> to somewhere in it. That's why stack size vary depending on the block
> >> you are in. Also stack never shrink, so once you get those 4K on it,
> >> they will stay there.
> >
> > As I said, this case is unlikely, but the other case I mentioned does
> > allow the compiler to optimise:
> >
> > int foo(void)
> > {
> >      const char *bla;
> >      {
> >         char buf[LONGBUF];
> >         strftime(...);
> >         bla = eina_tmpstr_add;
> >      }
> >      // rest of code
> >      printf("%s\n", bla);
> >      return 0;
> > }
> >
> > this can even be simplified into a macro that accepts a parameter saying
> > which function to use to duplicate the string. strdup, eina_tmpstr_add,
> > eina_stringshare_add or whatever.
> >
> >>
> >>> However, looking at the tmpstr code, the solution there is no better.
> It
> >>> reallocs all the time and implements a whole mechanism there. Wouldn't
> >>> it be much better to create a eina_strftime_alloc (bad name) that
> >>> returns an allocated string of the right size and that can be used in
> >>> strbuf (strbuf_string_steal), tmpstr (add a string steal) and etc?
> >>> That's writing the function once, and having one API for all of eina,
> >>> instead of the proposed method which will add plenty and duplicate all
> >>> around.
> >>
> >> strftime doesn't tell you the size of the string you need. It only
> >> tells you if it failed to put the generated string inside the buffer
> >> you gave it. There is no way around doing a realloc loop with that
> >> API.
> >>
> >> As for writing the function once, it is the case right now. If I was
> >> to add that function to strbuf or stringshare, then I would obviously
> >> find a way to share the code. If not by just calling the tmpstr
> >> function. In general for Eina, we do not add a function until we have
> >> at least 2 users in our code base. So for now, I do not plan to add
> >> strftime to any other part of Eina, as we do not have the need for it.
> >
> > You are adding eina_tmpstr_add() without any users in the code base...
> > Anyhow, I think it would be better to add API in a common place, as we
> > know it'll be used more, than to add a specific API and then duplicating
> > it to a common place.
> >
> >>
> >>>>>> I see no reason to have bad code like that just for fewer line of
> code.
> >>>>>
> >>>>> It's not a fewer lines of code, it's fewer API and a lot less lines
> of
> >>>>> code. Also, adding such API implies to developers that they should
> >>>>> expect such API, so you are just feeding the evil loop.
> >>>>
> >>>> And they should use it, as the way they usually manually do it lead to
> >>>> bug. This is actually reducing our code base and also reduce our
> >>>> number of actual bugs.
> >>>
> >>> Have a general purpose function that does that (as above), no manual
> >>> work. Still reduces the codebase, much cleaner.
> >>
> >> eina_strftime is I guess what you are trying to say here. Basically
> >> not making it a tmpstr and just use free to get rid of it. I guess
> >> that's an acceptable solution.
> >
> > Yes, it's exactly what I'm saying. Let's revert this patch and work on
> > an eina_strftime instead.
> >
> >>
> >>>>> Btw, I was thinking about the whole tmpstr solution (which you know
> I am
> >>>>> not a fan off) last night, and I came up with two solutions I think
> are
> >>>>> better, and anyhow, are massively more efficient.
> >>>>> The first involves creating new API (but it's still less API than
> >>>>> tmpstr) for genlist and etc that accept a struct instead of a string.
> >>>>> This way we can add more info that will hint who should free it.
> >>>>> This is the more involved, clean solution, that we should maybe adopt
> >>>>> for EFL 2.0.
> >>>>
> >>>> So your solution is to pass a struct where there will be one byte
> >>>> dedicated to say if that string need to be freed ? I think that's far
> >>>> from elegant and fully error prone with no way for the compiler to
> >>>> help on that.
> >>>
> >>> Maybe we'll have more metadata in the future. The struct will be
> created
> >>> automatically with an helper function, so not error prone at all.
> >>
> >> I will have to see it, but I have yet to see what would all those
> >> metadata be. For the moment, I don't see how this could be an
> >> improvement over current code base.
> >
> > I don't think eina_tmpstr is a good solution that will scale nicely,
> > these are just alternative ideas.
> >
> >>
> >>>>> The easy hacky solution, which is a replacement to tmpstr (which is
> also
> >>>>> a hack) and is much simpler and more efficient, is to use this
> scheme:
> >>>>> For static strings you just return them:
> >>>>> return "foo";
> >>>>> for dynamic you return this:
> >>>>> return eina_tomstr_add("bar");
> >>>>>
> >>>>> You can free both with eina_tomstr_free(buf);
> >>>>>
> >>>>> The implementation is where the magic is at, in memory, it looks
> like this:
> >>>>> "\x02bar", the first char being the STX (start of text) char. We just
> >>>>> use eina_tomstr_str_get(buf) whenever we want to use it. It just
> returns
> >>>>> "buf + 1" if the first char is STX. It's infinitely faster and more
> >>>>> memory efficient than tmpstr, I think also much cleaner. The only bad
> >>>>> thing is that you need to use the str function, but I think that's a
> >>>>> reasonable compromise.
> >>>>
> >>>> That would obviously lead to random bug. If you free a buf that has
> >>>> not been allocated by your tomstr, it will lead to an off by one
> >>>> access, if you are unlucky you will fall on a non allocated page. If
> >>>> you are really unlucky it will fall on an existing data that does have
> >>>> the expected value and call free on a static buffer. Also it won't be
> >>>> that much faster or slower as the current implementation walk a short
> >>>> list that should definitively be only a few elements long.
> >>>
> >>> How so? As I said, genlist gets "\x02bar", so genlist can safely check
> >>> the *first* (not -1) character of the string. It can then pass the
> >>> result of string_get() to where it needs actual text, but it keeps the
> >>> real reference for future processing. I don't see why or how we'll have
> >>> text that starts with "\0x02". If we want to make it even more rare, we
> >>> can use two bytes. Still much more efficient than tmpstr. Why would the
> >>> the current implementation be a short list? You want this to be used
> all
> >>> around with dynamic strings, this list will grow as usage will grow.
> >>
> >> Oh, so you never let the API user know that it is indeed a string and
> >> allways make it a hidden structure. That would make the code base
> >> quite hugly with all those str_get arround for little benefit. Also
> >> having the risk to have a confused API with random behavior sounds bad
> >> to me. If you want to hide information in a less problematic way, you
> >> can hide easily one bit in the pointer itself. As your API wouldn't
> >> autorize the use of the string directly anyway, having the least
> >> significant bit used to say if the pointer is to be freed or not will
> >> be clearly less error prone. Still it's hugly as hell and I think
> >> should be avoided in Efl code base (That's not the first time we have
> >> ruled against using that trick in Eina code base, and I am still
> >> against this kind of trick).
> >
> > I think the proposed mechanism is much safer than encoding in the
> > pointer. Encoding in the pointer will result in bad memory access and
> > thus a crash. My suggestion can cause some issues with comparison,
> > that's it. You can create mechanisms that make it completely error free.
> > For example, you could save it internally (when you do) as a type that
> > can't be cast to a string, and then you'd have to use the functions
> > otherwise it wouldn't compile. This is just a clean way to maintain the
> > string ABI but still making it safe internally.
> >
> >>
> >> As for why it should stay a short list it is due to what tmpstr is
> >> used for. Code pattern for it is of the form :
> >>
> >> tmp = function_returning_tmpstr();
> >> function_call(tmp);
> >> tmpstr_del(tmp);
> >>
> >> So either we have a massive recursion (which is wrong and need to be
> >> fixed), or we are leaking tmpstr. tmpstr is not intended to be used
> >> for outside of a block and should not be in a structure for example.
> >> As I said, tmpstr alive at any point in time should be pretty low and
> >> walking that said list never be a performance issue.
> >>
> >
> > True. If we only use it in genlist (and similar), this is the usage
> > pattern and it won't be the end of the world. However, as I said, I
> > proposed above a solution I find more clean and more general purpose.
> > Can be used in many places and saved for the long term. This is a bit
> > diverging from the purpose of this thread though (my bad). It's more
> > important that we deal with the issue at hand before we freeze.
> >
> >
> >
> >
> > Conclusion: let's add eina_strftime and eina_tmpstr_manage_new and
> > revert this before the freeze.
>
> Btw, the rewrite should also fix the warnings.
>
> src/lib/elm_calendar.c:175:11: warning: return discards ‘const’
> qualifier from pointer target type [-Wdiscarded-qualifiers]
>      return eina_tmpstr_strftime(E_("%B %Y"), selected_time);
>             ^
>

There was a quick patch to fix the warnings. But its incorrect. :/
https://phab.enlightenment.org/D3113

@tasn, @cedric
I will add eina_strftime and eina_tmpstr_manage_new as discussed and 
accordingly revert my patch
https://phab.enlightenment.org/D3048 
will update: https://phab.enlightenment.org/D3087(example)
and then update elm_calendar code accordingly. 
The above warning patch, I did not have a choice but to typecast due to 
signature of format_func, but hopefully with above changes,
I wont have to typecast I will update with corrected code. 
@tasn, sorry to use same email client, will change soon


> --
> Tom.
>
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> enlightenment-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>
------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

Re: [E-devel] [EGIT] [core/efl] master 17/20: eina_tmpstr: add eina_tmpstr_strftime

Reply via email to