Data mining: what is it? How does it work?

Posted 05/04/2021 10:24 AM, Updated 05/04/2021 10:25 AM

Did you read that the little fox Jolly Redd was coming to Animal Crossing: New Horizons before Nintendo had even officially announced it? The explanation is simple: some clever people were able to parse lines of code while everyone else was still reading coffee grounds. This phenomenon, commonly known as “data mining”, shows no sign of losing its popularity. It has its stars, followed by thousands of people on social networks, as well as its favored community platforms, which spread information that is usually released at the designers' discretion. The principle? Open the files of an alpha, a demo or an application with various programs to view their contents and reveal their secrets. Titles like Sea of Thieves, Genshin Impact or Fortnite have seen unannounced elements surface on the Internet ahead of time. But can anyone set themselves up as a “data miner” without risk? And, to borrow a theme from the Rare game mentioned above, can this technique be likened to piracy, which harms works and therefore the teams that create them?

What are you dating me

They go by the pseudonyms Ninji, Senescallo, That1MiningGuy, Antediluviana or Im_a_Fucking_Cereal (sic). Their job? Sharing on Twitter, Discord and Reddit their findings about elements of popular games that have not yet been revealed. These range from upcoming Genshin Impact characters and upcoming activities in Animal Crossing: New Horizons to future Apex Legends weapons. Their method? Data mining.

Data mining is a term with several meanings. In the professional world, it means running automated techniques, such as algorithms, to cross-reference different pieces of information, which are then examined. This allows a data analyst to spot trends and gain insights, a skill that is highly valued in the age of social media and games as a service. In video games, the term commonly denotes the analysis, by a third party, of information found in a program's code or on a public test client. Data miners use several tools to find, within a title, lists of models, sounds or even chapters, in order to identify information that has not yet been published.

With a bit of luck, there is no need to dig too deep. In 2014, World of Warcraft patch 5.4 revealed the Warlords of Draenor expansion through a folder named Iron Horde that had been added to the client. The Wrath of the Lich King expansion had also given itself away in the same manner, more or less deliberately. When the data is not in files that can be reached by conventional means, snoopers take other, more difficult paths. Some software can examine the contents of a .pak archive and potentially give access to a lot of information hidden across multiple files. In most cases, sensitive resources are encrypted for obvious reasons of confidentiality. Asked by ActuGaming, That1MiningGuy explains that he uses DXZTPorter and VPK Extractor to access models and images from Apex Legends.
The user who data-mined the Resident Evil VII demo and posted its contents on Reddit stated that the .pak file was protected, but that he had used the HxD software to open the executable. Whatever the method used, the mysterious investigator shared what he had learned from the demo and exposed various secrets thanks to the files he unearthed. That was all other members needed to put together a list of chapters grouping the events of the game with great precision, based on the information found. By analyzing the code in the demo, it was possible to guess the weapons, enemies, scenarios and even the contents of future DLCs. The same was done by reading the Sea of Thieves .pak file, which revealed the arrival of several items ahead of time. With some notions of programming languages, the right software and, most importantly, a lot of patience, it becomes possible to interpret lines of code and bring out what has not yet been shown. It must be said that when an animation is named “kraken_ingestplayer”, it is fairly easy to infer that a sea monster is on the agenda and that it will be able to swallow the player.
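
The kind of inference described above, reading a name like “kraken_ingestplayer” out of a game file, can be illustrated with a minimal Python sketch. It assumes the simplest possible case, in which resource names sit unencrypted as plain ASCII text inside a binary file. The sample bytes, the “anim/” and “sfx/” prefixes, the second asset name and the functions `extract_strings` and `find_hints` are all invented for illustration; they do not come from any real game or from the tools mentioned in this article.

```python
import re


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return every run of printable ASCII at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def find_hints(strings: list[str], keywords: list[str]) -> list[str]:
    """Keep only the strings that contain one of the keywords (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [s for s in strings if any(k in s.lower() for k in lowered)]


# Hypothetical stand-in for bytes read from a game file: readable
# resource names separated by non-printable binary noise.
blob = (
    b"\x00\x01"
    b"anim/kraken_ingestplayer\x00\xff\xfe"
    b"\x10\x02ab\x00"
    b"sfx/cannon_fire_01\x00"
)

names = extract_strings(blob)
for hint in find_hints(names, ["kraken"]):
    print(hint)
```

On a real file, `blob` would be replaced by something like `open(path, "rb").read()`. Encrypted or compressed archives would yield nothing readable with this approach, which is one reason data miners turn to dedicated extraction tools.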

You are not that apk!

In fact, any sufficiently motivated person with good computer skills can try to extract information from a game (or a demo, an update, a patch) in order to unlock its secrets. It should be noted, however, that the data found commits only those who choose to believe in it. Indeed, it is common to find traces of abandoned developments in these lists. In our Resident Evil VII example we find, among other things, remnants of multiplayer experiments as well as the name of a well-known character from the series who ultimately stayed behind the curtain. The same goes for Apex Legends, where a Smart Pistol was discovered in the code but ultimately never left the armory. Data mining is therefore far from being an exact science. And that is quite normal, since it is only a free interpretation of a computer language.

While some data miners describe themselves as code archaeologists and reject any connection between their practice and hacking or reverse engineering, publishers tend to view them with suspicion. On the legal side, it is commonly believed that there is nothing illegal in searching for information hidden in software. Users who publish their findings are even echoed by specialist websites, which do not otherwise promote conventional hacking in their columns. So would it be up to developers to be careful about how they protect their code and name their files?

“The disclosure of information that is still unknown to the public through the simple analysis of freely available data is not in itself illegal,” explains Nicolas Bressand, a lawyer in Lyon and an expert in intellectual property law, who was contacted on March 12, 2021. He continues: “With data mining, the problem lies more in the way the data was retrieved and/or in the nature of that data. Under French law, programs are protected by copyright. In particular, this enables rights holders to prohibit the decompilation of their software. When data mining involves decompiling software, it becomes an illegal practice, since decompiling is an infringement. In addition, data mining can also involve bypassing technical safeguards put in place by the publisher to lock the game, or even disclosing works that it is illegal to reproduce, such as game visuals or music.”

Some publishers, such as Activision Blizzard, include a contractual ban on data mining in the clauses of their user licenses. “In this case, data mining also represents a violation of the user license and thus a breach of contract,” confirms Nicolas Bressand. “It is therefore possible to link the practice of data mining to a large number of violations of legal and/or contractual obligations,” he adds. This means that data miners can theoretically be held liable. In practice, legal action is rare. Niantic did put pressure on Pokevision, a site that used information from Pokemon GO to locate the little virtual monsters. But as a rule, the industry giants remain discreet when it comes to data mining, and they declined to answer our questions on the topic.

“Do publishers have an interest in taking action against data miners? I don’t think so,” says Nicolas Bressand. “In most cases, the damage suffered by the publisher seems extremely limited. For example, when data mining of Red Dead Redemption II revealed that a PC version was in development, the publisher likely lost the announcement effect tied to that upcoming release, but what real harm did it suffer? Within video game communities, the damage done by legal action would likely be much greater than the damage caused by the leaks.” It should also be borne in mind that studios, happy to receive a little free advertising, may be consenting victims of this practice.
“I think that, for the most part, even though publishers can easily establish violations of their rights, they will likely prefer not to take action against data miners. Especially since the revelation of information that is still secret often creates a buzz around the game and arouses players' interest,” acknowledges Nicolas Bressand. He concludes: “It could be different if a data miner succeeded in disclosing a large amount of important information at once.”

In the public interest?

Granted, this fishing for information by freshwater sailors is talked about mostly when its practitioners bring to the surface some important, still secret news about the big games of the moment. However, data mining is not only about lifting the veil on future content. In a game like World of Warcraft, it provides access to extremely detailed information on elements of the universe such as farming areas and enemy patterns. In Minecraft, it reveals little stories that are sometimes funny, sometimes strange (even unsettling). In fact, a game's code contains a lot of details about how it was made.

Ehm, a data miner, explains: “It’s fascinating to take a look at what could have been in a game. As with a behind-the-scenes feature on a movie, exploring an application's data gives the impression of going backstage. Unused resources often tell, in their own way, a story about the development of the title. Sometimes you can even find a hidden message from a developer telling you, quite literally, about the game’s development.”

Like gold prospectors, data miners rummage through the code of our games with the excitement of uncovering buried content, and the public’s eyes light up. These investigators 2.0 are considered essential by a whole community of gamers, and they dig in a gray area that is generally left alone by the authorities. Enough to experience the thrill of piracy in peace, without having to indulge in barbaric acts of hacking. Under the watchful eyes of the publishers.

By Carnbee, journalist