Hi,

I have a slightly different view on name conflicts for newly created
notes. In a multi-user environment, AFAIK, most teams and companies use a
prefix internally as a group policy. In my case, it is
user/{user_id}/{notebook_name_they_want}.zpln. With that layout, naming
conflicts rarely happen, and each note is stored under a specific folder.
If someone needs two notes with the same name in the same directory, that
might not be appropriate. WDYT?
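
To make the layout concrete, here is a minimal sketch in Python. It is
purely illustrative and is not Zeppelin code: the helpers note_path and
has_conflict are hypothetical names, and the set of existing paths stands
in for whatever the notebook repo would list. It only shows that, with a
per-user prefix, two users can pick the same note name without colliding,
while a duplicate name within one user's folder is detected.

```python
from pathlib import PurePosixPath


def note_path(user_id: str, note_name: str) -> PurePosixPath:
    """Build user/{user_id}/{note_name}.zpln (hypothetical helper)."""
    return PurePosixPath("user") / user_id / f"{note_name}.zpln"


def has_conflict(existing: set, user_id: str, note_name: str) -> bool:
    """Return True if this user already has a note stored at that path."""
    return note_path(user_id, note_name) in existing


# Assumed state: alice already owns a note named "etl/daily_report".
existing = {note_path("alice", "etl/daily_report")}

# Same name, same user: conflict. Same name, different user: no conflict.
alice_conflict = has_conflict(existing, "alice", "etl/daily_report")
bob_conflict = has_conflict(existing, "bob", "etl/daily_report")
```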

JL

On Fri, Aug 31, 2018 at 4:44 AM, andreas.we...@gmail.com <
andreas.we...@gmail.com> wrote:

> Another reason for keeping noteId is uniqueness in multi-user
> environments. In that case users have separate Zeppelin workspaces, which
> is something we are using in production: see ZEPPELIN_NOTEBOOK_PUBLIC=false
> in the doc [1]. Users might be very confused when they cannot create a
> notebook with a name that already exists but belongs to a note they most
> likely can't see (yet).
>
> So I like the proposal {note_name}_{note_id}.zpln, where note_name could
> contain folders, e.g. folder_1/mynote_abcd.zpln. That said, I like
> {note_name}.{note_id}.zpln (a dot between note_name and note_id) even
> better :-)
>
> Regards
> Andreas
>
>
> [1] http://zeppelin.apache.org/docs/0.8.0/setup/security/notebook_authorization.html#separate-notebook-workspaces-public-vs-private
>
> On 2018/08/18 08:42:44, Jeff Zhang <zjf...@gmail.com> wrote:
> > BTW, I would also prefer to use the note name as the identifier of a
> > note, if the issue I mentioned before is acceptable to most users.
> >
> >
> >
> > Jeff Zhang <zjf...@gmail.com> wrote on Sat, Aug 18, 2018 at 4:40 PM:
> >
> > >
> > > > I am afraid we cannot remove noteId, as noteId is the unique
> > > > identifier of a note and is immutable, and it is used in a lot of
> > > > places, such as paragraph sharing and the REST API.
> > > > If we use the note name as the note id, then it may break users'
> > > > apps if the note name is changed.
> > >
> > >
> > > > Jongyoul Lee <jongy...@gmail.com> wrote on Sat, Aug 18, 2018 at 2:33 PM:
> > >
> > >> Hi, thanks for this kind of discussion.
> > >>
> > > >> About noteId, how about changing the note id to the note name? AFAIK,
> > > >> the note id is just an identifier and we can set any value to it.
> > > >>
> > > >> There are two potential problems. First, we should be more careful in
> > > >> handling the note id, as it could contain a wide variety of
> > > >> characters. Second, if someone changes a note name, those who are
> > > >> viewing and updating the same note would lose access to it. We could
> > > >> handle that by using websockets.
> > >>
> > >> WDYT?
> > >>
> > >> On Tue, 14 Aug 2018 at 6:14 PM Jeff Zhang <zjf...@gmail.com> wrote:
> > >>
> > > >>> >>> But I’m still not comfortable with note ids in the name of the
> > > >>> notebook itself.  Those names would look ugly if you shared your
> > > >>> notebooks on github for example.  You don’t see Jupyter notebooks
> > > >>> with names like that. If you have to keep the note ids with the
> > > >>> notebooks could you not simply put the note id at the top of the
> > > >>> notebook as Ruslan suggested? Then you’d only have to read the first
> > > >>> line of each notebook.
> > >>>
> > > >>> I know putting the note_id in the note file name is not so elegant,
> > > >>> but this is the compromise we have to make to keep compatibility, as
> > > >>> we use noteId to uniquely identify a note right now. And I don't
> > > >>> think putting noteId in the first line of the note would help much.
> > > >>> We would still have to read the note files, which takes much more
> > > >>> time than just reading the file names via the file system.
> > > >>>
> > > >>> Regarding the readability of the note file name, I think it won't
> > > >>> matter much. E.g. the note file name would look like: *My Project/My
> > > >>> Spark Tutorial Note_2A94M5J1Z.zpln*
> > > >>> What users see in the notebook menu is still *My Project/My Spark
> > > >>> Tutorial Note*, which is no different from what we see now.
> > > >>>
> > > >>> And thanks again for the feedback and comments, I am so glad to see
> > > >>> so much discussion in the community.
> > >>>
> > >>>
> > >>>
> > > >>> Partridge, Lucas (GE Aviation) <lucas.partri...@ge.com> wrote on Tue,
> > > >>> Aug 14, 2018 at 4:29 PM:
> > >>>
> > > >>>> I agree you’re inviting consistency issues if you maintained a
> > > >>>> separate note id-to-note name mapping file.
> > > >>>>
> > > >>>> But I’m still not comfortable with note ids in the name of the
> > > >>>> notebook itself.  Those names would look ugly if you shared your
> > > >>>> notebooks on github for example.  You don’t see Jupyter notebooks
> > > >>>> with names like that.  If you have to keep the note ids with the
> > > >>>> notebooks could you not simply put the note id at the top of the
> > > >>>> notebook as Ruslan suggested? Then you’d only have to read the first
> > > >>>> line of each notebook.
> > > >>>>
> > > >>>> Presumably if you copied the notebooks to another Zeppelin server
> > > >>>> they would be restored with the same note ids there too? And
> > > >>>> hopefully there would be no id clash with notebooks already on that
> > > >>>> server…
> > > >>>>
> > >>>> *From:* Jeff Zhang <zjf...@gmail.com>
> > >>>> *Sent:* 14 August 2018 03:49
> > >>>> *To:* users@zeppelin.apache.org
> > >>>>
> > >>>>
> > >>>> *Subject:* EXT: Re: [DISCUSS] ZEPPELIN-2619. Save note in
> [Title].zpln
> > >>>> instead of [NOTEID]/note.json
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > > >>>> Thanks for the discussion.
> > > >>>>
> > > >>>> >>> I'm afraid about non-latin symbols in folder and note name. And
> > > >>>> what about hieroglyphs?
> > > >>>>
> > > >>>> AFAIK, Linux allows any character in a file name except `\0` and
> > > >>>> '/'.  I can create a file name with Chinese characters on Linux, so
> > > >>>> I guess you can use Russian as well.
> > > >>>>
> > > >>>> >>> If I understand correctly, this is being done solely to speed up
> > > >>>> loading list of notebooks? What if a list of notebook names, their
> > > >>>> ids, folder structure, etc can be *cached* in a separate small json
> > > >>>> file? Or perhaps in a small embedded key-value store, like
> > > >>>> www.mapdb.org would do? Just thinking out loud. This would require a
> > > >>>> way to lazily re-sync the cache.
> > > >>>>
> > > >>>> This is not only to speed up loading but also to make the system
> > > >>>> architecture easier to maintain. For now we have to build the folder
> > > >>>> structure of notes in memory, and a lot of code in Zeppelin is doing
> > > >>>> this (personally I don't think we would need any code for this if we
> > > >>>> could get the folder structure from the note file storage system).
> > > >>>> Using another storage to keep the mapping between note name and note
> > > >>>> id would bring another classic distributed-systems problem:
> > > >>>> consistency. How do we ensure consistency between the real note file
> > > >>>> and this mapping component? If we create/rename/remove a note, we
> > > >>>> have to update both the notebook repo and the mapping storage. In my
> > > >>>> experience, any bug in that code would cause inconsistency.
> > > >>>>
> > > >>>> Ruslan Dautkhanov <dautkha...@gmail.com> wrote on Tue, Aug 14, 2018 at 3:58 AM:
> > >>>>
> > > >>>> Thanks for bringing this up for discussion. My 2 cents below.
> > > >>>>
> > > >>>> I am with Maksim and Felix on the concerns about special characters
> > > >>>> now being allowed in notebook names, and also about different
> > > >>>> charsets. Russian, for example, most commonly uses the iso-8859-5,
> > > >>>> koi-8r/u and windows-1251 charsets, etc. This seems like it will
> > > >>>> bring a whole new set of localization issues.
> > > >>>>
> > > >>>> If I understand correctly, this is being done solely to speed up
> > > >>>> loading list of notebooks? What if a list of notebook names, their
> > > >>>> ids, folder structure, etc can be *cached* in a separate small json
> > > >>>> file? Or perhaps in a small embedded key-value store, like
> > > >>>> www.mapdb.org would do? Just thinking out loud. This would require a
> > > >>>> way to lazily re-sync the cache.
> > > >>>>
> > > >>>> Another way to speed up json reads is to somehow force the "name"
> > > >>>> attribute to be at the top of the json document that's written to
> > > >>>> disk, then re-implement the json file reader to read just the header
> > > >>>> of the file and do a partial json parse (or, for lack of options,
> > > >>>> grab the "name" attribute from the json file header with a regex,
> > > >>>> for example).
> > > >>>>
> > > >>>> Back to filenames and charsets: I think the issue may be more
> > > >>>> complicated if you store notebooks on a remote filesystem (nfs/samba
> > > >>>> etc). What if the remote server and the local nfs client differ in
> > > >>>> their default fs charsets?
> > > >>>>
> > > >>>> Ideally all filesystems would use UTF-8, for example, but I am not
> > > >>>> certain that's a good assumption to make. Also, exposing notebook
> > > >>>> names can bring some other issues; I know some users occasionally
> > > >>>> add trailing/leading spaces, etc.
> > > >>>>
> > >>>> On Mon, Aug 13, 2018 at 10:38 AM Belousov Maksim Eduardovich <
> > >>>> m.belou...@tinkoff.ru> wrote:
> > >>>>
> > > >>>> The use of Russian and other specific letters in the note name is a
> > > >>>> big advantage of Zeppelin. I would not like to give up this
> > > >>>> functionality.
> > > >>>>
> > > >>>> I support the idea of the `zpln` file extension.
> > > >>>>
> > > >>>> The folder structure also sounds good.
> > > >>>>
> > > >>>> I'm afraid about non-latin symbols in folder and note name. And what
> > > >>>> about hieroglyphs?
> > > >>>>
> > > >>>> Apache Zeppelin may be the first to use Russian letters in the file
> > > >>>> system in our company.
> > > >>>>
> > > >>>> I see a lot of risks in using non-latin symbols and a lot of issues
> > > >>>> in making the new folder structure stable.
> > > >>>>
> > >>>> ------------------------------
> > >>>>
> > > >>>> *From:* Jeff Zhang <zjf...@gmail.com>
> > > >>>> *Sent:* 13 August 2018 12:50
> > > >>>> *To:* users@zeppelin.apache.org
> > > >>>> *Subject:* Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln
> > > >>>> instead of [NOTEID]/note.json
> > >>>>
> > >>>>
> > >>>>
> > > >>>> >>> Do we need the note id in the file name at all? What’s wrong
> > > >>>> with just note_name.zpln?
> > > >>>>
> > > >>>> The reason I keep the note id is that we currently use noteId to
> > > >>>> identify a note, e.g. we use the note id in both the websocket API
> > > >>>> and the REST API. It is almost impossible to remove noteId in the
> > > >>>> current architecture. If we put the note id into the file content of
> > > >>>> note_name.zpln, then we have to read the note file every time, and
> > > >>>> we run into the issues I mentioned above again.
> > > >>>>
> > > >>>> >>> If the file content is json then why not use note_name.json
> > > >>>> instead of .zpln? That would make it easier for editors to know how
> > > >>>> to load/highlight the file contents.
> > > >>>>
> > > >>>> I am not strongly attached to *.zpln, but I think one purpose is to
> > > >>>> help third parties identify Zeppelin notes properly, e.g. github can
> > > >>>> identify a jupyter notebook (*.ipynb) and render it properly.
> > > >>>>
> > > >>>> >>> Is there any reason for not using *real* folders or directories
> > > >>>> for organising the notebooks rather than embedding the folder
> > > >>>> hierarchy in the names of the notebooks?  If someone wants to ‘move’
> > > >>>> the notebooks to another folder they’d have to manually rename all
> > > >>>> the files/notebooks at present.  That’s not very user-friendly.
> > > >>>>
> > > >>>> Actually, my proposal is to use real folders. What users see in the
> > > >>>> Zeppelin note menu is the actual folder structure of the notes. If
> > > >>>> they want to move notebooks to another folder, they can change the
> > > >>>> folder name just as they would in the file system.
> > > >>>>
> > > >>>> Partridge, Lucas (GE Aviation) <lucas.partri...@ge.com> wrote on Mon,
> > > >>>> Aug 13, 2018 at 4:43 PM:
> > >>>>
> > > >>>> Hi Jeff,
> > > >>>>
> > > >>>> I have some questions about this proposal (I can’t edit the design
> > > >>>> doc):
> > > >>>>
> > > >>>>    1. Do we need the note id in the file name at all? What’s wrong
> > > >>>>    with just note_name.zpln?
> > > >>>>    2. If the file content is json then why not use note_name.json
> > > >>>>    instead of .zpln? That would make it easier for editors to know
> > > >>>>    how to load/highlight the file contents.
> > > >>>>    3. Is there any reason for not using *real* folders or
> > > >>>>    directories for organising the notebooks rather than embedding
> > > >>>>    the folder hierarchy in the names of the notebooks?  If someone
> > > >>>>    wants to ‘move’ the notebooks to another folder they’d have to
> > > >>>>    manually rename all the files/notebooks at present.  That’s not
> > > >>>>    very user-friendly.
> > > >>>>
> > > >>>> Thanks, Lucas.
> > >>>>
> > >>>> *From:* Jeff Zhang <zjf...@gmail.com>
> > >>>> *Sent:* 13 August 2018 09:06
> > >>>> *To:* users@zeppelin.apache.org
> > >>>> *Cc:* dev <d...@zeppelin.apache.org>
> > > >>>> *Subject:* EXT: Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln
> > > >>>> instead of [NOTEID]/note.json
> > >>>>
> > >>>>
> > >>>>
> > >>>> In that case, zeppelin should fail to create note.
> > >>>>
> > >>>>
> > >>>>
> > > >>>> Felix Cheung <felixcheun...@hotmail.com> wrote on Mon, Aug 13, 2018 at 3:47 PM:
> > >>>>
> > >>>> Perhaps one concern is users having characters in note name that are
> > >>>> invalid for file name/file path?
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> ------------------------------
> > >>>>
> > >>>> *From:* Mohit Jaggi <mohitja...@gmail.com>
> > >>>> *Sent:* Sunday, August 12, 2018 6:02 PM
> > >>>> *To:* users@zeppelin.apache.org
> > >>>> *Cc:* dev
> > >>>> *Subject:* Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln
> > >>>> instead of [NOTEID]/note.json
> > >>>>
> > >>>>
> > >>>>
> > >>>> sounds like a good idea!
> > >>>>
> > >>>>
> > >>>>
> > > >>>> On Sun, Aug 12, 2018 at 5:34 PM Jeff Zhang <zjf...@gmail.com> wrote:
> > >>>>
> > > >>>> Motivation
> > > >>>>
> > > >>>>    The motivation of ZEPPELIN-2619 is to change the note storage
> > > >>>> structure. Previously we stored notes as {noteId}/note.json; we’d
> > > >>>> like to change that to {note_name}_{note_id}.zpln. There are several
> > > >>>> reasons for this change.
> > > >>>>
> > > >>>>    1. {noteId}/note.json is not scalable. We put all notes in one
> > > >>>>    root folder in a flat structure, and when the Zeppelin server
> > > >>>>    starts, we need to read every note.json to get the note name and
> > > >>>>    build the note folder structure (because the note name, which we
> > > >>>>    need to build the notebook menu, is stored in note.json). This
> > > >>>>    becomes a nightmare when you have a large number of notes.
> > > >>>>    2. {noteId}/note.json is not maintainable. It is difficult for a
> > > >>>>    developer/administrator to find a note file based on the note
> > > >>>>    name.
> > > >>>>    3. {noteId}/note.json has no folder structure. Currently Zeppelin
> > > >>>>    has to build the folder structure internally in memory according
> > > >>>>    to the note name, which is a big overhead.
> > > >>>>
> > > >>>> New Approach
> > > >>>>
> > > >>>>    As I mentioned above, I propose to change the note storage
> > > >>>> structure to {note_name}_{note_id}.zpln.  note_name could contain
> > > >>>> folders, e.g. folder_1/mynote_abcd.zpln
> > > >>>>
> > > >>>> This kind of note storage structure could bring several benefits.
> > > >>>>
> > > >>>>    1. We don’t need to load all notes when Zeppelin starts. We just
> > > >>>>    need to list each folder to get the note name and note_id.
> > > >>>>    2. It is much more maintainable, since it is easy to find the
> > > >>>>    note file based on the note name.
> > > >>>>    3. It already has a folder structure, which can be mapped to the
> > > >>>>    note folder structure.
> > > >>>>
> > > >>>> Side Effect
> > > >>>>
> > > >>>> This approach only works for file system storage, which means we
> > > >>>> have to drop support for MongoNotebookRepo. I think that is OK
> > > >>>> because I haven't seen any users talk about it in the community, so
> > > >>>> I assume no one is using it.
> > > >>>>
> > > >>>> This is the overall design; any comments and feedback are welcome.
> > > >>>> Thanks.
> > > >>>>
> > > >>>> Here's the Google doc; you can also comment there.
> > > >>>>
> > > >>>> https://docs.google.com/document/d/126egAQmhQOL4ynxJ3AQJQRBBLdW8TATYcGkDL1DNZoE/edit?usp=sharing
> > > >>>>
> > >>>> --
> > >> 이종열, Jongyoul Lee, 李宗烈
> > >> http://madeng.net
> > >>
> > >
> >
>



-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net
