That it felt it was too long, I asked too much of you, and it decided the best thing I could do for you was just to throw a page out of the list. So you will notice the handout that we gave out on Wednesday, it goes straight from Page 4 to Page 7 I think or something like that. It doesnft really quite make sense when you read it because itfs in the middle of talking about one problem, and then suddenly itfs talking about something else. We photocopied the missing page and itfs in the lobby on the way out. You can also just get it from the web. The PDF thatfs up therefs complete. You can print the middle page out by yourself or just grab the whole PDF to take a look at, but you do wanna have that handy when you start working on the assignment because it will be quite mysterious if you try to read it as is with that missing page. What are we going on to talk about? Ifm gonna talk about functions as data a little bit and client callbacks, and hopefully a little bit of introduction to recursion. We wonft get very far in recursion today, just enough to get some terminology and see a couple simple examples, but recursion is gonna be sort of the entirety of next week, so wefll have lots of time to soak this in and become masters of it. And that reading that matches with that is basically Chapter 4, 5, and 6 in order, which is all the recursion chapters in the text. As you can probably tell, Ifm still working on getting rid of my strep. Technically, Ifm not contagious. Apparently, once youfve been on antibiotics for 24 hours, youfre no longer spreading the germs, but Ifd just as soon not want you to hang out with me today more than you need to. Ifm not sure itfs good for you or for me for that matter, but I will be actually in my office for a while after class, so if you were hoping to come see me on Wednesday, and you couldnft because I canceled my hours or you just have something on your mind, feel free to walk back with me and talk a little bit today. Anything administratively? Way in the back? [Student:][Inaudible] that handout [inaudible]? Oh, it will. Hefs asking about solutions to Section 2, and wefre -- I bet you Jason could have it up before the end of the -- he could put it up right now. How about that? We try to hold them back until actually after people have had section, just because it helps to kind of increase the mystery, keep the suspense, but then sometimes we forget to kind of patch together. So if we ever do that, send us an e-mail. Jasonfs actually your section go-to guy actually, so if you actually have questions about the section materials, he is the most on top of that, more so than I am. Anything else? Okay, so thank you for coming out in the rain. I appreciate you trudging over here and risking your health and life. So Ifm gonna do a little diversion here to explain this idea of functions as data before I come back to fixing the problem we had kind of hit upon at the very end of the last lecture about set. So what Ifm just gonna show you is a little bit about the design of something in C++ that is gonna helpful in solving the problem wefd run into. What Ifm gonna show you here is two pieces of code that plot a single value function across the number line. So in this case, the function is that itfs the top one is applying the sine function, the sine wave as it varies across, and then the square root function which goes up and has a sloping curve to it, and that both of these have exactly the same behaviors. Theyfre designed to kind of go into the graphics window, and theyfre just using a kind of a very simple kind of hand moving strategy. It starts on a particular location based on what the value of sine is here, and then over a particular interval at 0.1 of one tenth of an inch, it plots the next point and connects a line between them. So it does a simple kind of line approximation of what that function looks like over the interval you asked it to plot. The same code is being used here for plotting the square root. And the thing to note and I tried to highlight by putting it in blue here was that every other part of the code is exactly the same except for those two calls where I need to know whatfs the value of square root starting at this particular X. So starting at Value 1, what is sine of one? What is square root of one? As I work my way across the interval, whatfs a square root of 1.1, 1.2, 1.3 and sine of those same values? And so everything about the code is functionally identical. Itfs kind of frustrating to look at it, though, and realize if I wanted to plot a bunch of other things -- I also wanna be able to plot the cosine function, or some other function of my own creation -- across this interval that I keep having to copy and paste this code and duplicate it. And so one of the things that hopefully your 106A and prior experiences really heightened your attention to is if I have the same piece of code in multiple places, I ought to be able to unify it. I ought to be able to make there be a plot function that can handle both sine and square root without actually having to distinguish it by copy and pasting, so I wanna unify these two. And so there is -- the mechanism that wefre gonna use kinda follows naturally if you donft let yourself get too tripped up by what it means. Just imagine that the parameter for example going into the function right now are the start and the stop, the interval from X is one to five. What wefd like to do is further parameterize the function. Wefd like to add a third argument to it, which is to say and when youfre ready to know what function youfre plotting, herefs the one to use. Ifd like you to plot sine over the interval start to stop. Ifd like you to plot square root over that. So we added the third argument, which was the function we wanted to invoke there. Then we would be able to unify this down to where we had one generic plot function. The good news is that this does actually have support features in the C++ language that let us do this. The syntax for it is a little bit unusual, and if you think about it too much, it can actually make your head hurt a little bit, think about what itfs doing. But you can actually use a function, a functionfs name and the code that is associated with that name, as a piece of data that not just -- you think of the code as wefre calling this function, and wefre moving these things around, and executing things. The things that we tend to be executing and operating on you think of as being integers, and strings, and characters, and files. But you can also extend your notion of whatfs data to include the code you wrote as part of the possibilities. So in this case, Ifve added that third argument that I wanted to plot, and this syntax here thatfs a little bit unusual to you, and Ifll kind of identify it, is that the name of the parameter is actually FN. Its name is enclosed in parentheses there. And then to the right would be a list of the arguments or the prototype information about this function, and on the left is its return value. So what this says is you have two doubles, the start and stop, and the third thing in there isnft a double at all. It is a function of one double that returns a double, so a single value function that operates on doubles here. That is the syntax in C++ for specifying what you want coming in here is not a double, not a ray of doubles, anything funky like that. It is a function of a double that returns a double. And then in the body of this code, when I say FN here where I wouldfve said sine, or square root, or identified a particular function, itfs using the parameter that was passed in by the client, so it says call the clientfs function passing these values to plot over the range using their function in. So the idea is that valid calls to plot now become things like plot -- and then give it an interval, zero to two, and you give the name of a function: the sine function, which comes out of the math library, the square root function, also in the math library. It could be that the function is something you wrote yourself, the my function. In order for this to be valid though, you canft just put any old function name there. It is actually being quite specific about it that plot was to find [inaudible]. It took a double, returned a double. Thatfs the kind of function that you can give it, so any function that has that prototype, so it matches that format is an acceptable one to pass in. If you try to pass something that actually just does some other kind of function, it doesnft have the same prototype, so the get line that takes no arguments and returns a string just doesnft match on any front. And if I try to say here, plot the get line function of the integral two to five, it will quite rightfully complain to me that that just doesnft make sense. Thatfs because it doesnft match. So a little bit of syntax here, but actually kind of a very powerful thing, and it allows us to write to an addition to kind of parameterizing on these things you think of as traditional data, integers, and strings, and whatnot. Itfs also say as part of your operations, you may need to make a call out to some other function. Letfs leave it open what that function is and allow the client to specify what function to call at that time. So in this case for a plot, what function that youfre trying to plot, let the client tell you, and then based on what they ask you to plot you can plot different things. All right. Way in the back. [Student:]Is there a similar setup for multivariable functions [inaudible]? Certainly. So all I would need to do if this was a function that took a couple arguments, I would say double comma double comma int. If it returned void, [inaudible] returned. Sometimes it looks a little bit like the prototype kinda taken out of context and stuffed in there, and then those parens around the function are a very important part of that which is telling you yeah, this is a function of these with this prototype information. Behind you. No? Youfre good now. Somebody else? Over here? [Student:]Is that FN a fixed name [inaudible]? No. Itfs just like any parameter name. You get to pick it. So I couldfve called it plot function, my function, your function, whatever I wanted. Here in the front. [Student:][Inaudible] Java? So Java doesnft really have a similar mechanism that looks like this. C does, so C++ inherits it from C. There are other ways you try to accomplish this in Java. It tries to support the same functionality in the end, but it uses a pretty different approach than a functions as data approach. [Inaudible]? [Student:]Can you pass like a method, like an operator? So typically not. This syntax thatfs being used here is for a free function, a function thatfs kind of out in the global namespace, that level. There is a different syntax for passing a method, which is a little bit more messy, and we wonft tend to need it, so I wonft go there with you, but it does exist. It just as it stands does not want a method. It wants a function. So a method meaning member function of a class. Okay, so let me -- that was kind of just to set the groundwork for the problem we were trying to solve in set, which was set is holding this collection of elements that the client has stuffed in there. Itfs a generic templative class, so it doesnft have any preconceived notion about whatfs being stored. Are the strings? Are they student structures? Are they integers? And that in order to perform its operations efficiently, it actually is using this notion of ordering, keeping them in an order so that it can iterate an order. It can quickly find on the basis of using order to quickly decide where something has to be and if itfs present in the collection. So how does it know how to compare something thatfs of unknown type? Well, what it does is it has a default strategy. It makes an assumption that if I used equals equals and less than that would tell me kinda where to go. And so itfs got this idea that it wants to know. Wefll give it two things. Are they the same thing? In which case, it uses kinda the zero to show the ordering between them. If one precedes the other, it wants to have some negative number that says well this one precedes it, or some positive number if it follows it. So it applies this operation to the [inaudible] that are there, and for strings, and ints, and doubles, and characters, this works perfectly well. And so thatfs how without us going out of our way, we can have sets of things that respond to the built in relational operators without any special effort as a client. But what we can get into trouble with right is when equals equals less than donft make sense for type. So let me just go actually type some code, and Ifll show you. I have it on the slide, but Ifm gonna -- wow, this chair suddenly got really short. [Inaudible] fix that. Okay. Wefll go over here because I think itfs better just to see it really happening, so Ifm gonna ignore this piece of code because itfs not what I wanted. But if I make some student T structure, and itfs got the first and last name, and itfs got the ID number, and maybe thatfs all I want for now -- that if down in my main -- canft find my main. There it is. Ifm gonna need that piece of code later, so Ifm gonna leave it there. If I make a set, and I say Ifd like a set of students -- my class -- and I do this. And so I feel like I havenft gotten [inaudible]. I made this structure. I say Ifd like to make a set of students, that each student is in the class exactly once and I donft need any duplicates, and I go through the process of trying to compile this, itfs gonna give me some complaints. And its complaint, which is a little bit hard to see up here, is therefs no match for operator equals equals in one, equals equals two, and operator less than in one equals two. So itfs kind of showing me another piece of code thatfs kind of a hidden piece of code I havenft actually seen directly, which is this operator compare function. And that is the one that the set is using, as it has this idea of whatfs the way it should compare two things. It says well I have some of this operator compare that works generically on any type of things using equals equals and less than. And if I click up here on the instantiate from here, itfs gonna help me to understand what caused this problem. The problem was caused by trying to create a set who was holding student T. And so this gives you a little bit of an insight on how the template operations are handled by the compiler, that I have built this whole set class, and it depends on there being equals equals and less than working for the type. The fact that it didnft work for student T wasnft a cause for alarm until I actually tried to instantiate it. So at the point where I said Ifd like to make a set holding student T, the first point where the compiler actually goes through the process of generating a whole set class, the set angle brackets student T, filling in all the details, kinda working it all out, making the add, and contains, and whatever operations, and then in the process of those, those things are making calls that try to take student T objects and compare them using less than and equals equals. And that causes that code to fail. So the code thatfs really failing is kind of somewhere in the class library, but itfs failing because the things that wefre passing through and instantiating for donft work with that setup. So itfs something wefve gotta fix, wefve gotta do something about. Let me go back here and say what am I gonna do. Well, so therefs my error message. Same one, right? Saying yeah, you canft do that with a [inaudible]. Well, what we do is we use this notion of functions as data to work out a solution for this problem that if you think about kinda whatfs going on, the set actually knows everything about how to store things. Itfs a very fancy efficient structure that says given your things, keeps them in order, and it manages to update, and insert, and search that thing very efficiently. But it doesnft know given any two random things how to compare them other than this assumption it was making about less than and equals equals being a way to tell. If it wants to have sort of a more sophisticated handling of that, what it needs to do is cooperate with the client -- that the implementer of the set canft do it all. So therefs these two programmers that need to work in harmony. So what the set does is it allows for the client to specify by providing a function. It says well when I need to compare two things, how about you give me the name of a function that when given two elements will return to me their ordering, this integer zero, negative, positive that tells me how to put them in place. And so the set kind of writes all of its operations in terms of well therefs some function I can call, this function that will compare two things. If they donft specify one, Ifll use this default one that maps them to the relationals, but if they do give me one, Ifll just ask them to do the comparison. And so then as its doing its searching and inserting and whatnot, itfs calling back. We call that calling back to the client. So the client writes a function. If I wanna put student Ts into a set, then I need to say when you compare two student Ts, what do you look at to know if theyfre the same or how to order them. So maybe Ifm gonna order them by ID number. Maybe Ifm gonna use their first and last name. Whatever it means for two things to be equal and have some sense of order, I supply -- I write the function, and then I pass it to the set constructor. I say herefs the function to use. The set will hold on to that function. So I say herefs the compare student structure function. It holds onto that name, and when needed it calls back. It says Ifm about to go look for a student structure. Is this the one? Well, I donft know if two student structures are the same. Ifll ask the client. Herefs two student structures. Are they the same? And then as needed, itfll keep looking or insert and add and do whatever I need to do. So letfs go back over here. Ifll write a little function. So the prototype for it is it takes two elem Ts and it returns an int. That int is expected to have a value zero if they are the same, and a value that is negative if the first argument precedes the second. So if A is less than B, returns some negative thing. You can return negative one, or negative 100, or negative one million, but you need to return some negative value. And then if A [cuts out] some ordering, so theyfre not equal [cuts out] later, it will return some positive value: 1, 10, 6 million. So if I do [cuts out] say I use ID num as my comparison. Based on [cuts out] are the same, if they are I can return zero. And if ID num of A is less than the ID num of B, I can return negative one, and then in the other case, Ifll return one. So it will compare them on the basis [cuts out] figuring that the name field at that point is nothing new. And then the way I use that is over here when Ifm constructing it is there is [cuts out] to the constructor, and so thatfs how I would pass [cuts out] add parens to the [cuts out] as Ifm declaring it, and then I pass the name. Do I call it compare student or compare students? I canft remember. Compare student. Okay. And this then -- so now therefs nothing going on in the code -- causes it to compile, and that if you were to put letfs say a see out statement in your comparison function just for fun, you would find out as you were doing adds, and compares, and removes, and whatnot on this set that you would see that your call kept being made. It kept calling back to you as the client saying I need to compare these things. I need to compare these things to decide where to put something, whether it had something, and whatnot. And then based on your ordering, that would control for example how the iterator worked, that the smallest one according to your function would be the one first returned by your iterator, and it would move through larger or sort of later in the ordering ones until the end. So itfs a very powerful mechanism thatfs at work here because it means that for anything you wanna put in a set, as long as youfre willing to say how it is you compare it, then the set will take over and do the very efficient storing, and searching, and organizing of that. But you, the only piece you need to supply is this one little thing it canft figure out for itself, which is given your type of thing, how do you compare it. For the built in types string, and int, and double, and car, it does have a default comparison function, that one that was called operator compare. Let me go open it for you [inaudible] the 106. So there is a compare function dot H, and this is actually what the default version of it looks. Itfs actually written as a template itself that given two things it just turns around and asks the built in operators to help us out with that. And that is the name thatfs being used if I open the set and you look at its constructor call. I had said that I would come back and tell you about what this was that the argument going into the set constructor is one parameter whose name is CMP function that takes two elem type things -- so herefs the one in the example of the two argument prototype -- returns an int, and then it uses a default assignment for that of operator compare, the one we just looked at, so that if you donft specify it, it goes through and generates the standard comparison function for the type, which for built-ins will work fine, but for user defined things is gonna create compiler errors. So you can also choose if you donft like the default ordering works. So for example, if you wanted to build a set of strings that was case insensitive -- so the default string handling would be to use equals equals and less than, which actually does care about case. It doesnft think that capital [inaudible] is the same as lower case. If you wanted it to consider them the same, you could supply your own. A compare case insensitively function took two strings, converted their case, and then compared them. And then when you establish a set of strings, instead of letting the default argument take over, go ahead and use your case insensitive compare, and then now you have a set of strings that operates case insensitively. So you can change the ordering, adapt the ordering, whatever you like for the primitives, as well as supply the necessary one for the things the built-ins donft have properties for. So then thatfs that piece of code right there. All right. So does that make sense? Well, now you know kind of the whole range of things that are available in the class library. All right, so we saw the four sequential containers, the vector, stack, queue, and the grid that you kinda indexed ordering, and kinda allowed you to throw things in and get them back out. We went through the map, which is the sort of fancy heavy lifter that does that key value lookup, and then wefve seen the set, which does kind of aggregate collection management, and very efficient operations for kind of searching, retrieving, ordering, joining with other kind of sets, and stuff that also has a lot of high utility. I wanna do one quick little program with you before I start recursion just because I think itfs kinda cool is to talk a little about this idea of like once you have these ADTs, you can solve a lot of cool problems, and thatfs certainly what this weekfs assignment is about. Itfs like well here are these tasks that if you didnft have -- So ADTs, just a reminder -- I say this word as though everybody knows exactly what it means -- is just the idea of an abstract data type. So an abstract data type is a data type that you think of in terms of what it provides as an abstraction. So a queue is this FIFO line, and how it works internally, what itfs implemented as, wefre not worried about it at all. Wefre only worried about the abstraction of enqueue and dequeue, and it coming first in first out. So we talk about ADTs, we say once we have a queue, a stack, or a vector, we know what those things do, what a mathematical set is about. We build on top of that. We write code that leverages those ADTs to do cool things without having to also manage the low level details of where the memory came from for these things, how they grow, how they search, how they store and organize the data. You just get to do real cool things. So you probably got a little taste of that at the end of 106A when you get to use the array list and the hashmap to do things. This set kinda just expands out to fill out some other niches where you can do a lot of really cool things because you have these things around to build on. So one of the things that happens a lot is you tend to do layered things, and youfll see a little bit of this in the assignment youfre doing this week where itfs not just a set of something, itfs a set of a map of something, or a vector of queues, a map of set. So I gave you a couple of examples here of the things that might be useful. Like if you think of what a smoothie is, itfs a set of things mixed together, some yogurt, and some different fruits, some wheatgrass, whatever it is you have in it. And that the menu for a smoothie shop is really just a bunch of those sets, so each set of ingredients is a particular smoothie they have, and then the set of all those sets is the menu that they post up on the board you can come in and order. The compiler tends to use a map to keep track of all the variables that are in scope. As you declare variables, it adds them to the map so that when you later use that name, it knows where to find it. Well, it also has to manage though not just one map, but the idea is as you enter and exit scopes, there is this layering of open scopes. So you have some open scope here. You go into a for loop where you open another scope. You add some new variables that when you look it actually shadows then at the nearest definition, so if you had two variables of the same, it needs to look at the one thatfs closest. And then when you exit that scope, it needs those variables to go away and no longer be accessible. So one model for that could be very much a stack of maps where each of those maps represents the scope thatfs active, a set of variable names, and maybe their types and information is stored in that map. And then you stack them up. As you open a scope, you push on a new empty map. You put things into it, and then maybe you enter another scope. You push on another new empty map. You stick things into it, but as you exit and hit those closing braces, you pop those things from the stack to get to this previous environment you were in. So let me do a little program with you. I just have this idea of how this would work, and wefll see if I can code something up. So I have -- letfs go back over here. Ifm going to -- this is the piece of code that just reads words, so thatfs a fine piece of code to have here. I have a file here -- let me open it up for you -- that just contains the contents of the Official Scrabble Playersf Dictionary, Edition 2. Itfs got a lot of words it. Itfs pretty long. Itfs still loading. Letfs go back and do something else while itfs loading. It happened to have about 120,000 words I think is what it would be right now. So itfs busy loading, and I have this question for you. There are certain words that are anagrams of each other. The word cheap can be anagrammed into the word peach, things like that. And so I am curious for the Official Scrabble Playersf Dictionary what -- so if you imagine that some words can be anagrammed a couple times, five or six different words just on how you can rearrange the letters. Ifm curious to know what the largest anagram cluster is in the Official Scrabble Playersf Dictionary. So Ifd like to know across all 127,000 words that they form little clusters, and Ifd like to find out whatfs the biggest of those clusters. Okay. Thatfs my goal. So herefs what Ifve got going. Ifve got something thatfs gonna read them one by one. So letfs brainstorm for a second. I want a way to take a particular word and kinda stick it with its other anagram cluster friends. Whatfs a way I might do that? Help me design my data structure. Help me out. [Inaudible]. Ifve got the word peach. Where should I stick it so I can -- [Student:]You could treat each string as like a set of -- So I have this string, which represents the letters. Ifve got the word peach. I wanna be able to stick peach with cheap, so where should I stick peach in such a way that I could find it. And youfve got this idea that the letters there are a set. Theyfre not quite a set, though. Be careful because the word banana has a couple As and a couple Ns, and so itfs not that Ifd really want it to come down to be the set BAN. I wouldnft wanna coalesce the duplicates on that, so I really do wanna preserve all the letters that are in there, but your ideafs getting us somewhere. Itfs like there is this idea of kind of like for any particular word therefs the collection of letters that it is formed from. And somehow if I could use that as an identifier in a way that was reliable -- anybody got any ideas about what to do with that? [Student:]If you did that a vector, each letter and then the frequency of each letter in the word? So I could certainly do that. I could build kind of a vector that had kinda frequencies, that had this little struct that maybe it was number of times it occurs. Then I could try to build something on the basis of that vector that was like here is -- do these two vectors match? Does banana match apple? And youfd say well no. It turns out they donft have the same letters. Ifm trying to think of a really easy way to represent that. Your ideafs good, but Ifm thinking really lazy. So somebody help me whofs lazy. [Student:]Could you have a map where the key is all the letters in alphabetical order and the value is a vector of all the -- Yeah. That is a great idea. Youfre taking his idea, and youfre making it easier. Youfre capitalizing on lazy, which is yeah -- I wanna keep track of all the letters that are in the word, but I wanna do it in a way that makes it really easy for example to know whether two have the same -- wefll call it the signature. The signature of the word is the letter frequency across it. If I could come up with a way to represent the signature that was really to compare two signatures quickly to see if theyfre the same, then I will have less work to do. And so your idea is a great one. We take the letters and we alphabetize them. So cheap and peach both turn into the same ACEHP. Itfs a nonsense thing, but itfs a signature. Itfs unique for any anagram, and if I use a map where thatfs the key, then I can associate with every word that had that same signature. So letfs start building that. So let me write -- I wanna call this the signature. Given a string, itfs going to alphabetize it. Ifm going to write the dumbest version of this ever, but Ifm just gonna use a simple sorting routine on this. So smallest is gonna small index -- wefll call it min index. Thatfs a better name for it. Min index equals I, and then Ifm gonna look through the string, and Ifm gonna find the smallest letter thatfs there and move it to the front. Thatfs basically gonna be my strategy. Ifve got the wrong place to start, though. Ifm gonna look from I to the one. So if S sub J is less than S sub min index, so it is a smaller letter, then the min index gets to be J. So far, what Ifve done is Ifve run this loop that kind of starts in this case on the very first iteration and says the first character in Slot 0, that must be the min. And then it looks from there to the end. Is there anything smaller? Whenever it find anything smaller, it updates the min index, so when itfs done after that second loop has fully operated, then min index will point to the smallest alphabetically character in the string that we have. Then Ifm gonna swap it to the front. So S sub I with S sub min index, and Ifll write a swap because swap is pretty easy to write. See about how we do this, okay. So if I -- I like how this works, and then let me stop and say right here, now is a good time to test this thing. I just wrote signature. It probably works, but it potentially could not. And Ifm just gonna show you a little bit of how I write code. This is good to know. As I say, I put some code in here that I plan on throwing away. Enter word -- and then I throw it through my function. I think I called it signature, didnft I? Signature S -- so the idea being if it doesnft work, I wanna find out sooner rather than later. It doesnft like my use of get line. Is that because itfs not included? Yes, itfs not. So letfs go get the right header files. This is half the battle sometimes is figuring out what headers are needed to make your compiler happy. So now itfs up here at enter word, and I say what does cheap come out as? It goes into the debugger. Thatfs good. So it wasnft right. Now wefre gonna get to find out what does it not like about that. It says -- did we forget to return something? Letfs go look at our code. So itfs complaining about -- oh yeah, I can see where wefre gonna get in trouble here. Itfs complaining that the return -- it was saying that Ifm trying to print something and it looks like garbage, that what Ifm trying to print didnft make sense at all. I could say well thatfs funny. Letfs go look at signature. [Student:][Inaudible] pass the [inaudible]. Thatfs probably a good idea. We ought to fix that while wefre there. Let me leave that bug in for a minute because Ifm gonna fix my first bug, and then Ifll come back to that one. So it turns out what itfs really complaining about though is it has to do with -- I said well what does signature return. Somehow, whatfs being returned by signature is being interpreted as total crap when it got back to main, and therefs a very good reason for that because I never returned anything. So maybe if I had been less cavalier about the fact that it was giving me a warning over here that said control reaches the end of non-void function, but I was being lazy and I didnft look -- was that I didnft pay attention to the fact. So letfs leave my other bug in that youfve already pointed out because this is exactly what itfs like when youfre doing it. You enter words and you say cheap, and it says cheap, and youfre like no. And then youfre like about how about an. Hey look, thatfs in alphabetical order. And then you spend all your time thinking about words for a while. Well, tux. Thatfs in alphabetical order. It seems to work. You could do that for a while, but then youfre like this whole cheap, not good. Okay, so I come back here and my friend whofs already one step ahead of me has pointed out that my swap is missing the all important pass by reference that as it is it what swapping the copies that got passed to the function, but of course nothing was happening back here in signature land. So if I fix that, and I come back in here, Ifm gonna feel better about this. Oh, my gosh. And even tux still works. And if I say banana, I should get a bunch of As, a bunch of B and an N, so therefs various cases to test out if you have multiple letters, and if you have letters that are the same letters like apple so that it doesnft lose your duplicates, and it seems to come out right. So given this, the word peach should come down to the same signature as cheap, and so that seems to indicate wefre on the path toward building the thing we wanted to build. So I have this read file thing that I have left over from last time that reads each word, and I wanna change my vector into a map of set of string. What capitalization do you not like? [Student:][Inaudible]. Yeah. So it turns out this file happens to all be lower case, but therefs no harm in doing this. That way even if they werenft, itfll take care of that problem for us if we wanted it to. [Student:][Inaudible]. I want lower case. Oh yeah, I do. Well, your case. You guys are so picky. All right. Herefs my deal. Ifm gonna take my map, and this is gonna be a line of code thatfs gonna make your head spin. Just go with it. This is the do all craziness. Okay. So Ifve got this map, and what I wanna do is under the signature of this word, I want to look up the set of strings thatfs associated with it and tack this one in. And the add in this case with the set, I know thatfs gonna do non-duplication. In fact, the file doesnft contain duplicates, but if it did, I certainly donft want it to record it twice anyway, so I might as well do this. Now this form of this is a heavy lifter for a small piece of code. The signature then went and converted into the ACEHP form. I used that as the key into the table. If it was already there, itfs gonna retrieve me an existing set that Ifm gonna just go ahead and add a word onto. If it wasnft there, the behavior for the brackets is to create kind of a new empty value for that. So itfll use the key and create a default value for type. Well, the default value for set of string -- so if you just create a set without any other information in the default constructor will always create you a nice clean empty set. So in fact, it will get me exactly what I want which is to put it in there with an empty set that I will immediately add the word into. So after I do this, I should have this fully populated map. And then Ifm gonna do this just as a little test. Ifm gonna say num words when Ifm done to feel a little bit confident about what got put in there. How about I call it -- thatfs a good idea. Itfs so fussy. C++ never does what you want. I think I called this thing OSPD2.txt that has the words in it. And then I need to declare the variable that Ifm sticking all this stuff into is a set. So go in, load stuff, doing itfs thing, the number of words 112,882. Okay. Thatfs close enough. I canft remember the number thatfs in there, but that sounds like a fine approximation of it. So I feel like it did sort of manage to do something for it. And I can actually do this if I just wanna get a little glimpse of it is to use my iterator to look at something thatfs in there. Wait, is map a set of string? Iterator -- file iter.hasNext. Ifm gonna say key equals iter.next, and Ifm going to print that key, and Ifm going to print the size of the set because Ifm at this point -- I should see gobs of printing come out of this thing. It takes a little bit of a while to process the thing, and then see gobs and gobs of stuff going by. It looks like a lot of things are ones if you can imagine reading them as they go by because a lot of words are really just not anagrams of anything else. But some of the shorter ones have sort of a better chance. So you can find out here at the end that there are EEIKLPST. I donft know what that is. Leakiest? No. I donft know what that word is. I should write it in for any of them. This dictionary has a bunch of really crazy words in it too, so it makes it especially challenging. What is that? I donft know. That one almost looks like beginners, but itfs got an F in it. Itfs the F beginners, a very famous word. You guys have probably heard of it. So Ifve seen that, and now I wanna do the thing where I will pick the largest one. Ifd like to know. Somebody should tell me. Int max size [cuts out] max key, so Ifll set this to be zero, and max key is that. And then Ifm gonna do this. If the size of this key is greater than my max size, then itfs gonna get to be the new key. So after I did all of that then [cuts out] key. And then I probably wanna see what they are, so why donft I go ahead and take a look. Ah, I have to go to the type, though. [Inaudible] to equals M of key -- max key, I guess, dot iterator, [inaudible] IT has next, CLIT.next [inaudible]. So it went back. It found the maximum, in this case using the size of the sets as the distinguishing feature. And then max is AEPRS, which itfs got a big old list of about 12. I think thatfs 12, actually. Maybe 13. So now you know. You can impress your friends at parties. This is the kind of thing you can win bar bets on. Oh, yeah. Whatfs the size of the largest anagram cluster? Everybody wants to know this kind of stuff. I canft believe you guys can sleep at night without actually knowing this. And whatfs neat to know though [inaudible] just to point out a couple of things that -- you can use a little decomposition on this code, but therefs kind of a very small amount of things wefre having to do. For example, one of the things thatfs really powerful, things like the map where we can just figure out how to key the things to store the collection right under that key, then looking it up and adding something is a very efficient sort of direct operation, just building on these things and it going through and doing all the work of storing them, sorting them, making it efficient for us to retrieve them and look them up such that I can process 100,000 words in the blink of an eye, and then go back through, look at them all, find the biggest one, get my information out. [Student:]When you make the call to M.size, is that the number of words? [Inaudible]. That is the number of keys. [Student:]Keys, right. So thatfs not actually [inaudible]. Yeah. So it doesnft know anything about everything else that was [inaudible], but in fact thatfs why itfs [inaudible]. I know the dictionary has about 127,000 words. It turns out they form about 112 unique signatures, and so therefs actually another 20,000 words that are clung onto some existing signature. Thatfs the number of unique signatures across the dictionary, not the number of words, so thatfs probably the wrong name to call it. [Student:]For the M signature word thing where [inaudible] of the default just to create a new stack, that works as well for vectors [inaudible]? Yeah. It works for anything if you were just to declare it on the stack and the right thing happened, so vectors, set, maps. All those things do. But the primitive types like int and double, it doesnft. So it would work for string. String is an empty string. So for some of the fancier, more modern types tend to actually know how to just default construct themselves into a good state, but the primitives donft do that. So if you were having a map of ints and you wanted to have them start at zero, you need to really start them at zero. You can call just M sub this. It would be garbage, and it would just be operating with garbage from that way forward. All right. Well, wefre good. What Ifm gonna give you is the eight minute discussion of recursion that whets your appetite for the things wefre gonna be doing next week. So recursion is one of those things I think that when you havenft yet had a chance to explore it first hand and other people tell you about it, it has sort of an awe inspiring sort of mystery, some fear, and whatnot. So first off, I wanna kinda shake that fear off. It is a little bit hard to wrap your head around the first time you see it, but wefre gonna have a whole weekfs worth of time to spend on it, so wefre gonna try to give you a lot of different ways to think about it, and different problems to see to kinda help you do it. And I think once you do get your head around it, it turns out then youfll discover how infinitely powerful it is, that there is kind of a simple idea in it that once you kinda fully get your head around, you can explore and solve lots of different problems using this just one technique again and again. So in itself, itfs a little bit mysterious at first glance, but then once you kind of master it, youfll be amazed at the kind of neat things you can do with it. So it is certainly what Ifd consider an indispensable tool in a programmerfs tool kit. The kind of problems you can solve using the techniques you have so far is fundamentally limited. And part of what we need to do in this class is expose you to these new ways of solving harder, more sophisticated problems that the old techniques donft work for. One of the cool things about recursion is it actually lends very simple, elegant, short solutions to problems that at first glance seem completely unsolvable. That if you can formulate a structure for [inaudible], you will discover that the code is not long to write. Itfs not tedious to write. The tricky part is to figure out how to express it, so itfs more of a thinking puzzle than it is a coding puzzle. I certainly like thinking puzzles as much as coding puzzles if not more. The general sort of idea is that you are going to try to solve a problem -- instead of sort of breaking it down into component tasks like if I need to make dinner, I need to go to the store and buy things, and I need to come home, and chop them, and get the recipe. You think of what your standard decomposition is all about -- breaking down your tasks into A, B, and C, and D, and then you add them all together to get the whole task done. Recursion has this kind of very different way of thinking about the problem, which is like well if I needed to get Task A done, and I had Task A prime, which was somehow a lot like the task I was trying to solve, but it somehow was a little bit simpler, a little bit easier, a little bit more manageable than the one I started out to solve, and if I had that solution -- so if somehow I could delegate it off to some minion who works for me, and then I could use that to solve my problem, then my job would be made much easier by using that result of solving a similar problem thatfs a little bit [inaudible]. Okay, that seems a little bit wacky. Let me give you sort of an example of how this might work. So your standard problem, I said yeah, itfs like you do these dissimilar sub problems. Letfs imagine I had this goal where I wanted to survey the Stanford student body. I donft want just like a haphazard most of the people involved. Letfs say I really wanted to get input from every single person on campus whether they think having cardinal as your mascot is a ridiculous choice or not. So letfs imagine I really wanna hear from all 10,000 students. Now I can stand out in White Plaza with a big note pad and try to accost people and sort of work my way down the list. And then Ifd be there for eons and never solve my problem. Instead, what I decide to do is I say well Ifm gonna recruit some people to help me because Ifm lazy as wefve already established, and I would like to get some other people to join in my quest to answer these burning questions and to solve the survey. So what I do is I round up ten people letfs say, and I say would you help me, and I decide to divide the campus into kind of ten partitions. And I say if you could survey all the people whose names begin with A, B, and C, that would really help. And if you could do [inaudible] Gs, and if you would do -- and if I divide up the alphabet that way, give each of them two or three letters, and I say if you would go get the data, itfd be really easy for me to do my job then. If I just took all their data [inaudible]. Well, being the kind of lazy person that I am, itfs likely that the ten people I recruit would have similar lazy qualities because lazy people hang out with other lazy people. And so the person who was in charge of A, B, C, the first thing they do is turn around and find ten more friends, and then they divide it up and say could you do the AA through AM and so on. If they divide it into these pools of one tenth of what they were responsible for, and say you can go get the information from these people, and if they did the same thing -- so if everybody along the road. We started with 10,000. Now each person had 1,000 to survey. They asked their friend to do 100. Their friend asked ten people to do ten. And then at some point, the person who has ten says well I just need to ask these ten people. Once I get their data, we donft need to do anything further. So at some point the problem becomes so small, so simple, even though it was kind of the same problem all along. I just reproduced the same problem we had, but in a slightly more tractable form, but then I divided it around. Divide and conquer sometimes they call this to where I spread out the work around a bunch of people to where each personfs contribution is just a little part of the whole. You had to find the ten volunteers around underneath you and get their help in solving the problem, but nobody had to do much work, and thatfs kind of a really interesting way to solve a problem. It sounds like a very big problem of surveying 10,000 people, but by dividing and conquer, everybody has a little tiny role to play. You leverage a lot of people getting it done, and there is this self-similarity to it, which is kind of intriguing that everybody is trying to solve the same problem but just at different levels of scale. And so this idea applies to things like phone trees rather than trying to get the message out to everybody in my class, it might be that I call two people who call two friends who call two friends until everybody gets covered. Nobody does the whole job. Everybody just does a little part of it. Sometimes youfll see these fractal drawings where there is a large leaf which when you look closer actually consists of littler leaves, which themselves are littler leaves, so that at every level of scale the same image is being reproduced, and the result kind of on the outside is something that in itself if you look deeper has the same problem but smaller embedded in it. So itfs a pretty neat sort of way of solving things. I am gonna tell you a little bit about how the code looks, and then I really am not gonna be able to show you much code today. I think itfs actually even better to show you this code on Monday when we can come back fresh, but that it involves taking -- So wefre looking at functional recursion first, and functional recursion is taking some sort of functions, a function that takes an input and returns an answers, returns non-void thing that comes back. And wefre gonna be looking at problems where youfre trying to solve this big problem, and that if you had the answer to making a call to yourself on a smaller version of the problem -- maybe one call, maybe two calls, maybe several calls -- that you could add those, or multiply those, or combine those in some way to answer the bigger problem. So if I were trying to solve this campus survey, having the answers to these smaller campus surveys gives me that total result. And so the -- I really should not try to do this in two minutes. What I should do is try to tell you on Monday. Wefll come back and wefll talk about recursion, and it will be an impressive week. Meanwhile, work on your ADT homework.