Hi, I have a slightly different view on name conflicts for newly created notes. In a multi-user environment, AFAIK, most teams and companies use a prefix internally as a group policy. In my case, it is user/{user_id}/{notebook_name_they_want}.zpln. With that layout, naming conflicts rarely happen, and each note is stored under a specific folder. If someone needed two different notes with the same name in the same directory, this might not be appropriate. WDYT?
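To make the layout concrete, here is a minimal sketch of the per-user prefix idea. The helpers `note_path` and `has_conflict` are hypothetical names used only for illustration; they are not part of Zeppelin, and the validation rules are an assumption rather than anything the current code does.

```python
from pathlib import PurePosixPath
from typing import Iterable


def note_path(user_id: str, note_name: str) -> str:
    """Build 'user/{user_id}/{note_name}.zpln' (hypothetical helper, not Zeppelin API)."""
    if not user_id or "/" in user_id:
        raise ValueError("user_id must be a single non-empty path segment")
    # The note name may contain folders, e.g. 'reports/daily'.
    parts = [p for p in note_name.split("/") if p]
    if not parts or any(p in (".", "..") for p in parts):
        raise ValueError("note_name must be a relative path without '.' or '..'")
    return str(PurePosixPath("user", user_id, *parts)) + ".zpln"


def has_conflict(existing: Iterable[str], user_id: str, note_name: str) -> bool:
    """A conflict only occurs when the same user reuses the same name and folder."""
    return note_path(user_id, note_name) in set(existing)


existing_notes = [note_path("alice", "reports/daily")]
print(has_conflict(existing_notes, "alice", "reports/daily"))
print(has_conflict(existing_notes, "bob", "reports/daily"))
```

Under these assumptions, two users can pick the same note name without clashing, and a clash is only reported when one user creates the same name twice in the same folder.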
JL

On Fri, Aug 31, 2018 at 4:44 AM, andreas.we...@gmail.com <andreas.we...@gmail.com> wrote:

> Another reason for keeping noteId is uniqueness in multi-user environments. In that case users have separate Zeppelin workspaces, which is something we are using in production: see ZEPPELIN_NOTEBOOK_PUBLIC=false in the doc [1]. Users might be very confused when they cannot create a notebook with a name that already exists but that they most likely cannot see (yet).
>
> So I like the proposal {note_name}_{note_id}.zpln, where note_name could contain folders, e.g. folder_1/mynote_abcd.zpln. Even though I like {note_name}.{note_id}.zpln (dot between note_name and note_id) even better :-)
>
> Regards
> Andreas
>
> [1] http://zeppelin.apache.org/docs/0.8.0/setup/security/notebook_authorization.html#separate-notebook-workspaces-public-vs-private
>
> On 2018/08/18 08:42:44, Jeff Zhang <zjf...@gmail.com> wrote:
> > BTW, I also prefer to use the note name as the identifier of a note, if the issue I mentioned before is acceptable for most users.
> >
> > Jeff Zhang <zjf...@gmail.com> wrote on Sat, Aug 18, 2018 at 4:40 PM:
> >
> > > I am afraid we cannot remove noteId, as noteId is the unique identifier of a note and is immutable, and it is used in a lot of places, such as paragraph sharing and the REST API.
> > > If we use the note name as the note id, then it may break users' apps if the note name is changed.
> > >
> > > Jongyoul Lee <jongy...@gmail.com> wrote on Sat, Aug 18, 2018 at 2:33 PM:
> > >
> > >> Hi, thanks for this kind of discussion.
> > >>
> > >> About noteId, how about changing the note id to the note name? AFAIK, the note id is just an identifier and we can set any value to it.
> > >>
> > >> There are two potential problems. First, we should be more careful in handling the note id, as it could contain many different kinds of characters.
> > >> Second, in the case where someone changes a note name, those who are viewing and updating the same note would no longer be able to access it. We could handle that by using websockets.
> > >>
> > >> WDYT?
> > >>
> > >> On Tue, 14 Aug 2018 at 6:14 PM Jeff Zhang <zjf...@gmail.com> wrote:
> > >>
> > >>> >>> But I’m still not comfortable with note ids in the name of the notebook itself. Those names would look ugly if you shared your notebooks on github for example. You don’t see Jupyter notebooks with names like that. If you have to keep the note ids with the notebooks could you not simply put the note id at the top of the notebook as Ruslan suggested? Then you’d only have to read the first line of each notebook.
> > >>>
> > >>> I know putting note_id in the note file name is not so elegant, but this is the compromise we have to make to keep compatibility, as we use noteId to uniquely identify a note right now. And I don't think putting noteId in the first line of the note would help much. We would still have to read the note files, which takes much more time than just reading the file names via the file system.
> > >>>
> > >>> Regarding the readability of the note file name, I think it won't be affected much. E.g. the note file name looks like: *My Project/My Spark Tutorial Note_2A94M5J1Z.zpln*
> > >>> What the user sees in the notebook menu is still *My Project/My Spark Tutorial Note*, which is no different from what we see now.
> > >>>
> > >>> And thanks again for the feedback and comments, I am so glad to see so much discussion in the community.
> > >>>
> > >>> Partridge, Lucas (GE Aviation) <lucas.partri...@ge.com> wrote on Tue, Aug 14, 2018 at 4:29 PM:
> > >>>
> > >>>> I agree you’re inviting consistency issues if you maintained a separate note id-to-note name mapping file.
> > >>>>
> > >>>> But I’m still not comfortable with note ids in the name of the notebook itself. Those names would look ugly if you shared your notebooks on github for example. You don’t see Jupyter notebooks with names like that. If you have to keep the note ids with the notebooks could you not simply put the note id at the top of the notebook as Ruslan suggested? Then you’d only have to read the first line of each notebook.
> > >>>>
> > >>>> Presumably if you copied the notebooks to another Zeppelin server they would be restored with the same note ids there too? And hopefully there would be no id clash with notebooks already on that server…
> > >>>>
> > >>>> *From:* Jeff Zhang <zjf...@gmail.com>
> > >>>> *Sent:* 14 August 2018 03:49
> > >>>> *To:* users@zeppelin.apache.org
> > >>>> *Subject:* EXT: Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln instead of [NOTEID]/note.json
> > >>>>
> > >>>> Thanks for the discussion.
> > >>>>
> > >>>> >>> I'm afraid about non-latin symbols in folder and note name. And what about hieroglyphs?
> > >>>>
> > >>>> AFAIK, Linux allows any character in a file name except `\0` and '/'. I can create a file name with Chinese characters on Linux, so I guess you can use Russian as well.
> > >>>>
> > >>>> >>> If I understand correctly, this is being done solely to speed up loading list of notebooks? What if a list of notebook names, their ids, folder structure, etc can be *cached* in a separate small json file? Or perhaps in a small embedded key-value store, like www.mapdb.org would do? Just thinking out loud. This would require a way to lazily re-sync the cache.
> > >>>>
> > >>>> This is not only to speed up the loading but also to make the system architecture easier to maintain. For now we have to build the folder structure of notes in memory, and a lot of code in Zeppelin does this (personally I don't think we would need any code for this function if we could get the folder structure from the note file storage system). Using another storage to keep the mapping between note name and note id would bring another classic problem of distributed systems: consistency. How do we ensure consistency between the real note file and this mapping component? If we create/rename/remove a note, we have to update both the notebook repo and the mapping storage. Based on my experience, any bug in that code would cause inconsistency.
> > >>>>
> > >>>> Ruslan Dautkhanov <dautkha...@gmail.com> wrote on Tue, Aug 14, 2018 at 3:58 AM:
> > >>>>
> > >>>> Thanks for bringing this up for discussion. My 2 cents below.
> > >>>>
> > >>>> I am with Maksim and Felix on concerns about special characters now being allowed in notebook names, and also concerns about different charsets. Russian, for example, most commonly uses the iso-8859-5, koi-8r/u, windows-1251 charsets etc. This seems like it will bring a whole new set of localization issues.
> > >>>>
> > >>>> If I understand correctly, this is being done solely to speed up loading list of notebooks? What if a list of notebook names, their ids, folder structure, etc can be *cached* in a separate small json file? Or perhaps in a small embedded key-value store, like www.mapdb.org would do? Just thinking out loud. This would require a way to lazily re-sync the cache.
> > >>>>
> > >>>> Another way to speed up json reads is to somehow force the "name" attribute to be at the top of the json document that's written to disk. Then re-implement the json file reader to read just the header of the file and do a partial json parse (or, for lack of other options, grab the "name" attribute from the json file header with a regex, for example).
> > >>>>
> > >>>> Back to filenames and charsets, I think the issue may be more complicated if you store notebooks on a remote filesystem (nfs/samba etc). What if the remote server and the local nfs client differ in their default fs charsets?
> > >>>>
> > >>>> Ideally all filesystems would use UTF-8, for example, but I am not certain that's a good assumption to make. Also, exposing notebook names can bring some other issues; I know some users occasionally add trailing/leading spaces etc.
> > >>>>
> > >>>> On Mon, Aug 13, 2018 at 10:38 AM Belousov Maksim Eduardovich <m.belou...@tinkoff.ru> wrote:
> > >>>>
> > >>>> The use of Russian and other specific letters in the note name is a big advantage of Zeppelin. I would not like to give up this functionality.
> > >>>>
> > >>>> I support the idea of the `zpln` file extension.
> > >>>>
> > >>>> The folder structure also sounds good.
> > >>>>
> > >>>> I'm afraid about non-latin symbols in folder and note name. And what about hieroglyphs?
> > >>>>
> > >>>> Apache Zeppelin may be the first to use Russian letters in the file system in our company.
> > >>>>
> > >>>> I see a lot of risks in using non-latin symbols and a lot of issues in making the new folder structure stable.
> > >>>>
> > >>>> ------------------------------
> > >>>>
> > >>>> *From:* Jeff Zhang <zjf...@gmail.com>
> > >>>> *Sent:* 13 August 2018, 12:50
> > >>>> *To:* users@zeppelin.apache.org
> > >>>> *Subject:* Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln instead of [NOTEID]/note.json
> > >>>>
> > >>>> >>> Do we need the note id in the file name at all? What’s wrong with just note_name.zpln?
> > >>>>
> > >>>> The reason I keep the note id is that we currently use noteId to identify a note, e.g. we use the note id in both the websocket API and the REST API. It is almost impossible to remove noteId in the current architecture. If we put the note id into the file content of note_name.zpln, then we have to read the note file every time, and we hit the issues I mentioned above again.
> > >>>>
> > >>>> >>> If the file content is json then why not use note_name.json instead of .zpln? That would make it easier for editors to know how to load/highlight the file contents.
> > >>>>
> > >>>> I am not strongly biased towards *.zpln. But I think one purpose is to help third parties identify Zeppelin notes properly, e.g. github can identify a jupyter notebook (*.ipynb) and render it properly.
> > >>>>
> > >>>> >>> Is there any reason for not using *real* folders or directories for organising the notebooks rather than embedding the folder hierarchy in the names of the notebooks? If someone wants to ‘move’ the notebooks to another folder they’d have to manually rename all the files/notebooks at present. That’s not very user-friendly.
> > >>>>
> > >>>> Actually my proposal is to use real folders. What the user sees in the Zeppelin note menu is the actual notes folder structure. If they want to move the notebooks to another folder, they can change the folder name just as they would in a file system.
> > >>>>
> > >>>> Partridge, Lucas (GE Aviation) <lucas.partri...@ge.com> wrote on Mon, Aug 13, 2018 at 4:43 PM:
> > >>>>
> > >>>> Hi Jeff,
> > >>>>
> > >>>> I have some questions about this proposal (I can’t edit the design doc):
> > >>>>
> > >>>>    1. Do we need the note id in the file name at all? What’s wrong with just note_name.zpln?
> > >>>>    2. If the file content is json then why not use note_name.json instead of .zpln? That would make it easier for editors to know how to load/highlight the file contents.
> > >>>>    3. Is there any reason for not using *real* folders or directories for organising the notebooks rather than embedding the folder hierarchy in the names of the notebooks? If someone wants to ‘move’ the notebooks to another folder they’d have to manually rename all the files/notebooks at present. That’s not very user-friendly.
> > >>>>
> > >>>> Thanks, Lucas.
> > >>>>
> > >>>> *From:* Jeff Zhang <zjf...@gmail.com>
> > >>>> *Sent:* 13 August 2018 09:06
> > >>>> *To:* users@zeppelin.apache.org
> > >>>> *Cc:* dev <d...@zeppelin.apache.org>
> > >>>> *Subject:* EXT: Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln instead of [NOTEID]/note.json
> > >>>>
> > >>>> In that case, Zeppelin should fail to create the note.
> > >>>>
> > >>>> Felix Cheung <felixcheun...@hotmail.com> wrote on Mon, Aug 13, 2018 at 3:47 PM:
> > >>>>
> > >>>> Perhaps one concern is users having characters in note name that are invalid for file name/file path?
> > >>>>
> > >>>> ------------------------------
> > >>>>
> > >>>> *From:* Mohit Jaggi <mohitja...@gmail.com>
> > >>>> *Sent:* Sunday, August 12, 2018 6:02 PM
> > >>>> *To:* users@zeppelin.apache.org
> > >>>> *Cc:* dev
> > >>>> *Subject:* Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln instead of [NOTEID]/note.json
> > >>>>
> > >>>> sounds like a good idea!
> > >>>>
> > >>>> On Sun, Aug 12, 2018 at 5:34 PM Jeff Zhang <zjf...@gmail.com> wrote:
> > >>>>
> > >>>> Motivation
> > >>>>
> > >>>> The motivation of ZEPPELIN-2619 is to change the notes storage structure. Previously we stored notes as {noteId}/note.json; we’d like to change this to {note_name}_{note_id}.zpln. There are several reasons for this change.
> > >>>>
> > >>>>    1. {noteId}/note.json is not scalable. We put all notes in one root folder in a flat structure. When the Zeppelin server starts, we need to read every note.json to get the note name and build the note folder structure (because the note name, which we need to build the notebook menu, is stored in note.json). This would be a nightmare when you have a large number of notes.
> > >>>>    2. {noteId}/note.json is not maintainable. It is difficult for a developer/administrator to find a note file based on the note name.
> > >>>>    3. {noteId}/note.json has no folder structure. Currently Zeppelin has to build the folder structure internally in memory according to the note name, which is a big overhead.
> > >>>>
> > >>>> New Approach
> > >>>>
> > >>>> As I mentioned above, I propose to change the note storage structure to {note_name}_{note_id}.zpln. note_name could contain folders, e.g. folder_1/mynote_abcd.zpln
> > >>>>
> > >>>> This kind of note storage structure could bring several benefits.
> > >>>>
> > >>>>    1. We don’t need to load all notes when Zeppelin starts. We just need to list each folder to get the note name and note_id.
> > >>>>    2. It is much more maintainable, since it is easy to find the note file based on the note name.
> > >>>>    3. It already has a folder structure, which can be mapped to the note folder structure.
> > >>>>
> > >>>> Side Effect
> > >>>>
> > >>>> This approach only works for file system storage, which means we have to drop support for MongoNotebookRepo. I think that is OK because I haven't seen any users talk about it in the community, so I assume no one is using it.
> > >>>>
> > >>>> This is the overall design; any comments and feedback are welcome. Thanks.
> > >>>>
> > >>>> Here's the Google doc, you can also comment on it there.
> > >>>>
> > >>>> https://docs.google.com/document/d/126egAQmhQOL4ynxJ3AQJQRBBLdW8TATYcGkDL1DNZoE/edit?usp=sharing
> > >>>>
> > >> --
> > >> 이종열, Jongyoul Lee, 李宗烈
> > >> http://madeng.net

--
이종열, Jongyoul Lee, 李宗烈
http://madeng.net